Method and apparatus for measuring asymmetry of a microstructure, position measuring method, position measuring apparatus, lithographic apparatus and device manufacturing method

ABSTRACT

A lithographic apparatus includes a sensor, such as an alignment sensor including a self-referencing interferometer, configured to determine the position of an alignment target including a periodic structure. An illumination optical system focuses radiation of different colors and polarizations into a spot which scans the structure. Multiple position-dependent signals are detected and processed to obtain multiple candidate position measurements. Asymmetry of the structure is calculated by comparing the multiple position-dependent signals. The asymmetry measurement is used to improve accuracy of the position read by the sensor. Additional information on asymmetry may be obtained by an asymmetry sensor receiving a share of positive and negative orders of radiation diffracted by the periodic structure to produce a measurement of asymmetry in the periodic structure.

CROSS-REFERENCE TO RELATED APPLICATIONS

This application claims the benefit of U.S. provisional application 61/722,671, which was filed on 5 Nov. 2012, and which is incorporated herein in its entirety by reference.

FIELD

The present invention relates to an improved apparatus and method to measure a property or position of a structure. The invention in other aspects provides a lithographic apparatus and device manufacturing method, and also an optical element.

BACKGROUND

A lithographic apparatus is a machine that applies a desired pattern onto a substrate, usually onto a target portion of the substrate. A lithographic apparatus can be used, for example, in the manufacture of integrated circuits (ICs). In that instance, a patterning device, which is alternatively referred to as a mask or a reticle, may be used to generate a circuit pattern to be formed on an individual layer of the IC. This pattern can be transferred onto a target portion (e.g. comprising part of, one, or several dies) on a substrate (e.g. a silicon wafer). Transfer of the pattern is typically via imaging onto a layer of radiation-sensitive material (resist) provided on the substrate. In general, a single substrate will contain a network of adjacent target portions that are successively patterned. Known lithographic apparatus include so-called steppers, in which each target portion is irradiated by exposing an entire pattern onto the target portion at one time, and so-called scanners, in which each target portion is irradiated by scanning the pattern through a radiation beam in a given direction (the “scanning”-direction) while synchronously scanning the substrate parallel or anti-parallel to this direction. It is also possible to transfer the pattern from the patterning device to the substrate by imprinting the pattern onto the substrate.

In order to control the lithographic process to place device features accurately on the substrate, one or more alignment marks are generally provided on, for example, the substrate, and the lithographic apparatus includes one or more alignment sensors by which the position of the mark may be measured accurately. The alignment sensor may be effectively a position measuring apparatus. Different types of marks and different types of alignment sensors are known from different times and different manufacturers. A type of sensor known for a lithographic apparatus is based on a self-referencing interferometer as described in U.S. Pat. No. 6,961,116, the contents of which are incorporated herein in their entirety by reference. Generally a mark is measured separately to obtain X- and Y-positions. A combined X- and Y-measurement can be performed using one or more of the techniques described in U.S. patent application publication no. US 2009/0195768, the contents of which are incorporated herein in their entirety by reference.

SUMMARY

Advanced alignment techniques using an alignment sensor are described by Jeroen Huijbregtse et al. in “Overlay Performance with Advanced ATHENA™ Alignment Strategies”, Metrology, Inspection, and Process Control for Microlithography XVII, Daniel J. Herr, Editor, Proceedings of SPIE Vol. 5038 (2003). These strategies can be extended and applied in sensors of the type described by U.S. Pat. No. 6,961,116 and US 2009/0195768, mentioned above. A feature of a sensor is that it may measure position using several wavelengths (e.g., colors) and polarizations of radiation (e.g., light) on the same target grating or gratings. No single color is ideal for measuring in all situations, so the system selects, from a number of signals, the one that provides the most reliable position information.

There is continually a need to provide more accurate position measurement, especially to control the overlay error as product features get smaller and smaller. A cause of error in alignment may be asymmetry in the features making up a mark, which may be caused for example by processing to apply one or more subsequent product layers. A metrology tool such as a scatterometer exists that can measure asymmetry and one or more other parameters of a microstructure. Such a tool could be applied in principle to measure and correct for asymmetry or the other parameter. In practice, however, such a tool may not operate with the high throughput desired in the alignment task for high-volume lithographic production. Such a tool may additionally or alternatively be incompatible with the alignment environment in terms of its bulk, mass or power dissipation.

In a broad aspect, the invention aims to provide an alternative method and apparatus for the measurement of asymmetry (or more generally, one or more asymmetry dependent parameters) in a microstructure.

In a further aspect, the invention aims to provide an improved position measurement apparatus, for example an alignment sensor in a lithographic apparatus, that is able to correct for the influence of mark asymmetry on position measurement. In that regard, there is provided, in an embodiment, a method of measuring asymmetry that can be applied to measuring asymmetry in an alignment mark simultaneously with position measurement from that mark, without unduly reducing throughput of an alignment system. Further, in an embodiment, there is provided a method that employs signals already captured as part of the position measuring task.

According to an aspect, there is provided a method of measuring a property of a structure on, for example, a substrate, for example an asymmetry-related parameter, the method comprising:

(a) illuminating the structure with radiation and detecting radiation diffracted by the structure using one or more detectors;

(b) processing signals representing the diffracted radiation to obtain a plurality of results related to a position of the structure, each result having the same form but being influenced in a different way by variation in the property;

(c) calculating a measurement of the property of the structure that is at least partially based on a difference or differences observed among the plurality of results.

In an embodiment, the plurality of results includes results based on illumination and detection of radiation at different wavelengths, different polarizations and/or different spatial frequencies within a position-dependent signal received by one detector. The differences between results used in the calculating step (c) do not have to be expressed in a particular form such as simple subtraction. Differences between results can be expressed in any suitable form.
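By way of illustration only, the idea of step (c) can be sketched in a few lines of Python. The sketch assumes that step (b) has already produced one candidate position per wavelength/polarization channel. The channel labels, the numerical values and the function name `deviations_from_mean` are hypothetical, and deviation from the mean is just one possible way to express a difference among results; it is not presented as the calculation of any particular embodiment.

```python
from statistics import mean

def deviations_from_mean(candidates):
    """Return each channel's deviation from the mean candidate position.

    `candidates` maps a (wavelength, polarization) label to a position
    result, all in the same units. The labels are illustrative only.
    """
    centre = mean(candidates.values())
    return {channel: x - centre for channel, x in candidates.items()}

# Hypothetical candidate positions (micrometres) for one mark.
candidates = {
    ("green", "X"): 10.002,
    ("red", "X"): 10.006,
    ("near-IR", "X"): 9.998,
}

deviations = deviations_from_mean(candidates)
# A single scalar summary of how much the channels disagree.
spread = max(deviations.values()) - min(deviations.values())
```

For a perfectly symmetric mark and an ideal sensor, all channels would be expected to report the same position and the spread would tend to zero; a non-zero spread is the kind of difference on which a property measurement could be based.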

In an embodiment, the measurement calculated in step (c) includes one or more further results, for example one or more other results obtained using radiation diffracted by the structure, but not related to the position of the structure. The other result may be obtained for example using another detector processing a different portion of the radiation diffracted by the structure at the same time as the detection in step (b). The other result may alternatively or in addition include a result obtained from the same signals as the results related to the position of the structure.

According to an aspect, there is provided a method of measuring the position of a periodic structure on, for example, a substrate, the method comprising measuring a property of the structure using a method described above, and further comprising: (d) calculating a measurement of the position of the structure using one or more of the results obtained in step (b) and corrected in accordance with the measurement of the property obtained in step (c).

According to an aspect, there is provided a method of manufacturing devices wherein a device pattern is applied to a substrate using a lithographic process, the method including positioning the applied pattern by reference to a measured position of one or more periodic structures formed on the substrate, the measured position being obtained by a method as described herein.

According to an aspect, there is provided a lithographic apparatus comprising:

a patterning subsystem for transferring a pattern to a substrate;

a measuring subsystem for measuring positions of the substrate in relation to the patterning subsystem,

wherein the patterning subsystem is arranged to use the positions measured by the measuring subsystem to apply the pattern at a desired position on the substrate and wherein the measuring subsystem is arranged to measure the positions of the substrate using one or more periodic structures provided on the substrate and measuring the positions of the structure using a method as described herein.

According to an aspect, there is provided an apparatus for measuring the position of a structure on, for example, a substrate, the apparatus comprising:

an illuminating arrangement for illuminating the structure with radiation;

a detecting arrangement for detecting radiation diffracted by the structure using one or more detectors;

a processing arrangement for processing signals representing the diffracted radiation to obtain a plurality of results related to a position of the structure, each result having the same form but being influenced in a different way by variation in a property of the structure; and

a calculating arrangement for calculating a position of the structure using one or more of the results obtained by the processing arrangement,

wherein the calculating arrangement is arranged to include a correction in the calculated position in accordance with a measurement of the property of the structure, and wherein the calculating arrangement is arranged to calculate the measurement of the property of the structure at least partially on the basis of a difference observed among the plurality of results.

Embodiments of the invention enable measurements of a property, for example asymmetry, to be obtained or refined using information that is ordinarily captured by a sensor, but not ordinarily exploited. The plurality of results may, for example, include results based on different wavelengths, different polarizations, different spatial frequencies (diffraction orders), or one or more combinations of these. The method can be used in combination with more measurements of the property made by other means.

BRIEF DESCRIPTION OF THE DRAWINGS

Embodiments of the invention will now be described, by way of example only, with reference to the accompanying schematic drawings in which:

FIG. 1 depicts an exemplary lithographic apparatus including an alignment sensor as a measuring apparatus according to an embodiment of the invention;

FIG. 2, comprising FIG. 2(a) and FIG. 2(b), illustrates various forms of an alignment mark that may be provided in or on, for example, a substrate in the apparatus of FIG. 1;

FIG. 3 is a schematic block diagram of an alignment sensor scanning an alignment mark in the apparatus of FIG. 1;

FIG. 4 is a more detailed schematic diagram of an alignment sensor suitable for use in an embodiment of the present invention and useable as the alignment sensor in the apparatus of FIG. 1, and including off-axis illumination and an optional asymmetry measuring arrangement;

FIG. 5 illustrates (a) an on-axis illumination profile, (b) resulting diffraction signals, and (c) resulting self-referencing interferometer output for a single wavelength of radiation in one use of the position measuring apparatus of FIG. 4;

FIG. 6 illustrates (a) an off-axis illumination profile, (b) resulting diffraction signals, and (c) resulting self-referencing interferometer output in one use of the position measuring apparatus of FIG. 4;

FIG. 7 illustrates (a) an on-axis illumination profile, (b) resulting diffraction signals, and (c) resulting self-referencing interferometer output for multiple wavelengths of radiation in one use of the position measuring apparatus of FIG. 4;

FIG. 8 illustrates (a) an off-axis illumination profile, (b) resulting diffraction signals, and (c) resulting self-referencing interferometer output for multiple wavelengths of radiation in one use of the position measuring apparatus of FIG. 4;

FIG. 9 is a further detailed schematic diagram of the apparatus of FIG. 4, showing features of multiple wavelengths and polarization, omitted from FIG. 4 for clarity;

FIG. 10 is a flowchart of a method of measuring asymmetry and measuring position, according to an embodiment of the present invention;

FIG. 11 is a more detailed flowchart of part of the method of FIG. 10 showing more detail of measuring asymmetry using position measurement signals in the apparatus of FIGS. 4 and 9;

FIGS. 12 to 18 define coordinate systems and notation used in the mathematical description of the embodiment, including the influence of target tilt on polarization coordinate systems; and

FIG. 19 defines parameters of a model target structure that may be used in an embodiment.

DETAILED DESCRIPTION OF EXEMPLARY EMBODIMENTS

FIG. 1 schematically depicts a lithographic apparatus according to one embodiment of the invention. The apparatus comprises:

an illumination system (illuminator) IL configured to condition a radiation beam B (e.g. UV radiation or EUV radiation);

a support structure (e.g. a mask table) MT constructed to support a patterning device (e.g. a mask) MA and connected to a first positioner PM configured to accurately position the patterning device in accordance with certain parameters;

a substrate table (e.g. a wafer table) WTa or WTb constructed to hold a substrate (e.g. a resist-coated wafer) W and connected to a second positioner PW configured to accurately position the substrate in accordance with certain parameters; and

a projection system (e.g. a refractive projection lens system) PS configured to project a pattern imparted to the radiation beam B by patterning device MA onto a target portion C (e.g. comprising one or more dies) of the substrate W.

The illumination system may include various types of optical components, such as refractive, reflective, magnetic, electromagnetic, electrostatic or other types of optical components, or any combination thereof, for directing, shaping, or controlling radiation.

The support structure holds the patterning device in a manner that depends on the orientation of the patterning device, the design of the lithographic apparatus, and other conditions, such as for example whether or not the patterning device is held in a vacuum environment. The support structure can use mechanical, vacuum, electrostatic or other clamping techniques to hold the patterning device. The support structure may be a frame or a table, for example, which may be fixed or movable as required. The support structure may ensure that the patterning device is at a desired position, for example with respect to the projection system. Any use of the terms “reticle” or “mask” herein may be considered synonymous with the more general term “patterning device.”

The term “patterning device” used herein should be broadly interpreted as referring to any device that can be used to impart a radiation beam with a pattern in its cross-section such as to create a pattern in a target portion of the substrate. It should be noted that the pattern imparted to the radiation beam may not exactly correspond to the desired pattern in the target portion of the substrate, for example if the pattern includes phase-shifting features or so called assist features. Generally, the pattern imparted to the radiation beam will correspond to a particular functional layer in a device being created in the target portion, such as an integrated circuit.

The patterning device may be transmissive or reflective. Examples of patterning devices include masks, programmable mirror arrays, and programmable LCD panels. Masks are well known in lithography, and include mask types such as binary, alternating phase-shift, and attenuated phase-shift, as well as various hybrid mask types. An example of a programmable mirror array employs a matrix arrangement of small mirrors, each of which can be individually tilted so as to reflect an incoming radiation beam in different directions. The tilted mirrors impart a pattern in a radiation beam which is reflected by the mirror matrix.

The term “projection system” used herein should be broadly interpreted as encompassing any type of projection system, including refractive, reflective, catadioptric, magnetic, electromagnetic and electrostatic optical systems, or any combination thereof, as appropriate for the exposure radiation being used, or for other factors such as the use of an immersion liquid or the use of a vacuum. Any use of the term “projection lens” herein may be considered as synonymous with the more general term “projection system”.

As here depicted, the apparatus is of a transmissive type (e.g. employing a transmissive mask). Alternatively, the apparatus may be of a reflective type (e.g. employing a programmable mirror array of a type as referred to above, or employing a reflective mask).

The lithographic apparatus may be of a type having two (dual stage) or more substrate tables (and/or two or more patterning device tables). In such “multiple stage” machines the additional tables may be used in parallel, or preparatory steps may be carried out on one or more tables while one or more other tables are being used for exposure. The two substrate tables WTa and WTb in the example of FIG. 1 are an illustration of this. The invention disclosed herein can be used in a stand-alone fashion, but in particular it can provide additional functions in the pre-exposure measurement stage of either single- or multi-stage apparatuses.

The lithographic apparatus may also be of a type wherein at least a portion of the substrate may be covered by a liquid having a relatively high refractive index, e.g. water, so as to fill a space between the projection system and the substrate. An immersion liquid may also be applied to other spaces in the lithographic apparatus, for example, between the mask and the projection system. Immersion techniques are well known in the art for increasing the numerical aperture of projection systems. The term “immersion” as used herein does not mean that a structure, such as a substrate, must be submerged in liquid, but rather only means that liquid is located between the projection system and the substrate during exposure.

Referring to FIG. 1, the illuminator IL receives a radiation beam from a radiation source SO. The source and the lithographic apparatus may be separate entities, for example when the source is an excimer laser. In such cases, the source is not considered to form part of the lithographic apparatus and the radiation beam is passed from the source SO to the illuminator IL with the aid of a beam delivery system BD comprising, for example, suitable directing mirrors and/or a beam expander. In other cases the source may be an integral part of the lithographic apparatus, for example when the source is a mercury lamp. The source SO and the illuminator IL, together with the beam delivery system BD if required, may be referred to as a radiation system.

The illuminator IL may comprise an adjuster AD for adjusting the angular intensity distribution of the radiation beam. Generally, at least the outer and/or inner radial extent (commonly referred to as σ-outer and σ-inner, respectively) of the intensity distribution in a pupil plane of the illuminator can be adjusted. In addition, the illuminator IL may comprise various other components, such as an integrator IN and a condenser CO. The illuminator may be used to condition the radiation beam, to have a desired uniformity and intensity distribution in its cross-section.

The radiation beam B is incident on the patterning device (e.g., mask) MA, which is held on the support structure (e.g., mask table) MT, and is patterned by the patterning device. Having traversed the patterning device MA, the radiation beam B passes through the projection system PS, which focuses the beam onto a target portion C of the substrate W. With the aid of the second positioner PW and position sensor IF (e.g. an interferometric device, linear encoder or capacitive sensor), the substrate table WTa/WTb can be moved accurately, e.g. so as to position different target portions C in the path of the radiation beam B. Similarly, the first positioner PM and another position sensor (which is not explicitly depicted in FIG. 1) can be used to accurately position the patterning device MA with respect to the path of the radiation beam B, e.g. after mechanical retrieval from a mask library, or during a scan. In general, movement of the support structure MT may be realized with the aid of a long-stroke module (coarse positioning) and a short-stroke module (fine positioning), which form part of the first positioner PM. Similarly, movement of the substrate table WTa/WTb may be realized using a long-stroke module and a short-stroke module, which form part of the second positioner PW. In the case of a stepper (as opposed to a scanner) the support structure MT may be connected to a short-stroke actuator only, or may be fixed. Patterning device MA and substrate W may be aligned using patterning device alignment marks M1, M2 and substrate alignment marks P1, P2. Although the substrate alignment marks as illustrated occupy dedicated target portions, they may be located in spaces between target portions (these are known as scribe-lane alignment marks). Similarly, in situations in which more than one die is provided on the patterning device MA, the patterning device alignment marks may be located between the dies.

The depicted apparatus could be used in at least one of the following modes:

1. In step mode, the support structure MT and the substrate table WTa/WTb are kept essentially stationary, while an entire pattern imparted to the radiation beam is projected onto a target portion C at one time (i.e. a single static exposure). The substrate table WTa/WTb is then shifted in the X and/or Y direction so that a different target portion C can be exposed. In step mode, the maximum size of the exposure field limits the size of the target portion C imaged in a single static exposure.

2. In scan mode, the support structure MT and the substrate table WTa/WTb are scanned synchronously while a pattern imparted to the radiation beam is projected onto a target portion C (i.e. a single dynamic exposure). The velocity and direction of the substrate table WTa/WTb relative to the support structure MT may be determined by the (de-)magnification and image reversal characteristics of the projection system PS. In scan mode, the maximum size of the exposure field limits the width (in the non-scanning direction) of the target portion in a single dynamic exposure, whereas the length of the scanning motion determines the height (in the scanning direction) of the target portion.

3. In another mode, the support structure MT is kept essentially stationary holding a programmable patterning device, and the substrate table WTa/WTb is moved or scanned while a pattern imparted to the radiation beam is projected onto a target portion C. In this mode, generally a pulsed radiation source is employed and the programmable patterning device is updated as required after each movement of the substrate table WTa/WTb or in between successive radiation pulses during a scan. This mode of operation can be readily applied to maskless lithography that utilizes a programmable patterning device, such as a programmable mirror array of a type as referred to above.

Combinations and/or variations on the above described modes of use or entirely different modes of use may also be employed.

Lithographic apparatus LA is of a so-called dual stage type which has two tables WTa and WTb and two stations, an exposure station and a measurement station, between which the tables can be exchanged. For example, while one substrate on one substrate table is being exposed at the exposure station, another substrate can be loaded onto the other substrate table at the measurement station so that various preparatory steps may be carried out. In an embodiment, one table is a substrate table and another table is a measurement table including one or more sensors. Preparatory steps may be performed at the measurement station such as mapping the surface of the substrate using a level sensor LS and/or measuring the position of one or more alignment markers on, for example, the substrate using an alignment sensor AS. Such preparatory steps enable a substantial increase in the throughput of the apparatus. If the position sensor IF is not capable of measuring the position of the table while it is at the measurement station as well as at the exposure station, a second position sensor may be provided to enable the positions of the table to be tracked at both stations.

The apparatus further includes a lithographic apparatus control unit LACU which controls the movements and measurements of the various actuators and sensors described. Control unit LACU also includes signal processing and data processing capacity to implement desired calculations relevant to the operation of the apparatus. In practice, control unit LACU may be realized as a system of many sub-units, each handling the real-time data acquisition, processing and control of a subsystem or component within the apparatus. For example, one processing subsystem may be dedicated to servo control of the positioner PW. Separate units may even handle coarse and fine actuators, or different axes. Another unit might be dedicated to the readout of the position sensor IF. Overall control of the apparatus may be exercised by a central processing unit, communicating with these sub-system processing units, with operators and with other apparatuses involved in the lithographic manufacturing process.

FIG. 2(a) shows examples of alignment marks 202, 204, provided on substrate W for the measurement of X-position and Y-position, respectively. Each mark in this example comprises a series of bars formed in a product layer or other layer applied to or etched into, for example, the substrate. The bars are regularly spaced and act as grating lines so that the mark can be regarded as a diffraction grating with a sufficiently well-known spatial period (pitch). The bars on the X-direction mark 202 are substantially parallel to the Y-axis to provide periodicity in the X-direction, while the bars of the Y-direction mark 204 are substantially parallel to the X-axis to provide periodicity in the Y-direction. Marks can be periodic in one direction only, or in more than one direction at the same time. The skilled reader will appreciate that a mark in practice may not be perfectly periodic, as imperfections such as line edge roughness are present in any real structure. The alignment sensor AS (shown in FIG. 1) scans each mark optically with a spot 206 (X direction), 208 (Y direction) of radiation, to obtain a periodically-varying signal, such as a sine wave. The phase of this signal is analyzed, to measure the position of the mark, and hence of, for example, substrate W, relative to the alignment sensor, which in turn is fixed relative to the reference frame RF of the apparatus. The scanning movement is indicated schematically by a broad arrow, with progressive positions of the spot 206 or 208 indicated in dotted outline. The pitch of the bars (grating lines) in the alignment pattern is typically much greater than the pitch of product features to be formed on the substrate, and the alignment sensor AS uses a wavelength of radiation (or usually plural wavelengths) much longer than the exposure radiation to be used for applying a pattern to the substrate. Fine position information can be obtained because the large number of bars allows the phase of a repeating signal to be accurately measured.
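The phase analysis mentioned above can be illustrated with a minimal Python sketch. It assumes an idealized, noise-free signal of the form offset plus cosine, with a known signal period, sampled uniformly over a whole number of periods. The function name `phase_position`, the signal period of 1.6 and the true offset of 0.3 are hypothetical values chosen for illustration, and the relationship between signal period and grating pitch in a real sensor is not modeled here.

```python
import math

def phase_position(samples, xs, period):
    """Estimate the position of a sinusoidal signal, modulo its period.

    Projects the samples onto cosine and sine at the known period and
    converts the resulting phase into a distance. Assumes uniform
    sampling over a whole number of periods, so the constant offset
    in the signal cancels out of both sums.
    """
    c = sum(s * math.cos(2 * math.pi * x / period) for s, x in zip(samples, xs))
    q = sum(s * math.sin(2 * math.pi * x / period) for s, x in zip(samples, xs))
    phase = math.atan2(q, c)
    return (phase / (2 * math.pi) * period) % period

# Hypothetical scan: 8 signal periods, 256 uniformly spaced samples.
period = 1.6          # signal period, arbitrary length units
true_offset = 0.3     # position to be recovered, same units
xs = [i * (8 * period) / 256 for i in range(256)]
samples = [1.0 + 0.5 * math.cos(2 * math.pi * (x - true_offset) / period) for x in xs]

estimate = phase_position(samples, xs, period)
```

Because every period contributes to both sums, the phase estimate averages over all the bars scanned, which is one way to see why a large number of bars helps the measurement.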

Coarse and fine marks may be provided, so that the alignment sensor can distinguish between different cycles of the periodic signal, as well as the exact position (phase) within a cycle. Marks of different pitches can also be used for this purpose. These techniques are known to the person skilled in the art, and will not be detailed herein. The design and operation of such a sensor is known in the art, and each lithographic apparatus may have its own design of sensor. For the purpose of the present description, it will be assumed that the alignment sensor AS is generally of the form described in U.S. Pat. No. 6,961,116. FIG. 2(b) shows a modified mark for use with a similar alignment system, in which X- and Y-positions can be obtained through a single optical scan with the illumination spot 206 or 208. The mark 210 has bars arranged at substantially 45 degrees to both the X- and Y-axes. This combined X- and Y-measurement can be performed using the techniques described in U.S. patent application publication no. US 2009/0195768.

FIG. 3 is a schematic block diagram of an alignment sensor AS. Illumination source 220 provides a beam 222 of radiation of one or more wavelengths, which is diverted by a spot mirror 223 through an objective lens 224 onto a mark, such as mark 202, located on substrate W. As indicated schematically in FIG. 2, in the example of the present alignment sensor based on U.S. Pat. No. 6,961,116, mentioned above, the illumination spot 206 by which the mark 202 is illuminated may be slightly smaller in width (e.g., diameter) than the width of the mark itself.

Radiation scattered by mark 202 is picked up by objective lens 224 and collimated into an information-carrying beam 226. A self-referencing interferometer 228, such as of the type disclosed in U.S. Pat. No. 6,961,116 mentioned above, processes beam 226 and outputs separate beams (for each wavelength) onto a sensor array 230. Spot mirror 223 serves conveniently as a zero order stop at this point, so that the information carrying beam 226 comprises only higher order diffracted radiation from the mark 202 (this is not essential to the measurement, but improves signal to noise ratios). Intensity signals 232 from one or more individual sensors in sensor array 230 are provided to a processing unit PU. By a combination of the optical processing in the block 228 and the computational processing in the unit PU, values for X- and Y-position of the substrate relative to the reference frame RF are output. Processing unit PU may be separate from the control unit LACU shown in FIG. 1, or they may share the same processing hardware, as a matter of design choice and convenience. Where unit PU is separate, part of the signal processing may be performed in the unit PU and another part in unit LACU.

As mentioned already, a single measurement of the type illustrated fixes the position of the mark within a certain range corresponding to one pitch of the mark. Coarser measurement techniques are used in conjunction with this to identify which period of the sine wave is the one containing the marked position. The same process at coarser and/or finer levels can be repeated at different wavelengths for increased accuracy, and for robust detection of the mark irrespective of the materials from which the mark is made, and in, on and/or below which it sits. The wavelengths can be multiplexed and demultiplexed optically so as to be processed simultaneously, and/or they may be multiplexed by time division or frequency division. Examples in the present disclosure will exploit measurement at several wavelengths to provide a practical and robust measurement apparatus (alignment sensor) with reduced sensitivity to mark asymmetry.
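The combination of a coarse measurement with a fine, periodic measurement can be sketched as follows. This is a generic illustration of resolving a periodic ambiguity, not a description of the coarse measurement technique actually used. It assumes the coarse estimate is in error by less than half a period, and the function name `resolve_period` and all numerical values are hypothetical.

```python
def resolve_period(coarse, fine, period):
    """Combine a coarse position with a fine position known modulo `period`.

    Picks the whole number of periods that brings the fine result
    closest to the coarse estimate. Valid only if the coarse estimate
    is within half a period of the true position.
    """
    n = round((coarse - fine) / period)
    return fine + n * period

# Hypothetical values in arbitrary length units.
period = 1.6      # range covered unambiguously by the fine measurement
fine = 0.3        # fine result, known only modulo the period
coarse = 100.5    # coarse estimate, accurate to better than period / 2

position = resolve_period(coarse, fine, period)
```

Here the coarse estimate selects the 63rd period, so the combined position is 0.3 plus 63 periods, about 101.1 in the same units, while retaining the precision of the fine measurement.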

Referring to the measurement process in more detail, an arrow labeled v_(W) in FIG. 3 illustrates a scanning velocity with which spot 206 traverses the length L of mark 202. In this example, the alignment sensor AS and spot 206 in reality remain substantially stationary, while it is the mark 202 that moves with velocity v_(W). The alignment sensor can thus be mounted rigidly and accurately to the reference frame RF (FIG. 1), while effectively scanning the mark 202 in a direction opposite to the direction of movement of mark 202. The mark 202 in this example is controlled in this movement by its location on substrate W which is mounted on the substrate table WT and the substrate positioning system PW. All movements shown are substantially parallel to the X axis. Similar actions apply for scanning the mark 204 with spot 208 in the Y direction. This will not be described further.

As discussed in U.S. patent application publication no. US 2012-0212749, incorporated by reference herein in its entirety, high productivity requirements of the lithographic apparatus mean that measurement of the alignment marks at numerous positions on the substrate should be performed as quickly as possible, which implies that the scanning velocity v_(W) is fast, and the time T_(ACQ) available for acquisition of each mark position is correspondingly short. In simplistic terms, the formula T_(ACQ)=L/v_(W) applies. US 2012-0212749 describes a technique to impart an opposite scanning motion of the spot, so as to lengthen the acquisition time. The same scanning spot technique can be applied in a sensor and method of the type newly disclosed herein, if desired.
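The relation T_(ACQ)=L/v_(W) can be illustrated with a short calculation. The following is a sketch under stated assumptions: the function name, the parameter `v_spot` and all numerical values are hypothetical, and the effect of a scanning spot is modeled simply as a reduction of the relative velocity between spot and mark, which is an assumption of this sketch rather than a description of the technique in US 2012-0212749.

```python
def acquisition_time(mark_length, v_w, v_spot=0.0):
    """Acquisition time T_ACQ = L / (relative velocity of mark and spot).

    mark_length -- length L of the mark, in meters
    v_w         -- velocity of the mark, in meters per second
    v_spot      -- assumed velocity of the spot in the same direction as the mark
                   (0.0 reproduces the simple formula T_ACQ = L / v_W)
    """
    relative_velocity = v_w - v_spot
    if relative_velocity <= 0:
        raise ValueError("spot must move slower than the mark for a scan to occur")
    return mark_length / relative_velocity


# Hypothetical numbers: an 80 micron mark moving at 0.04 m/s.
t_static = acquisition_time(80e-6, 0.04)           # stationary spot
t_scanning = acquisition_time(80e-6, 0.04, 0.02)   # spot follows the mark at half speed
print(t_static, t_scanning)
```

With these illustrative values the acquisition time doubles when the spot follows the mark at half the mark velocity.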

There is interest in aligning on marks with smaller grating pitches. The measured overlay in real production can be significantly larger than under controlled test conditions. This may be due to the alignment marks on product substrates becoming asymmetric to varying degrees during processing. Reducing the pitch of the alignment marks decreases the effect of some types of asymmetry on the measured alignment position.

Some options to allow reduction of the pitch of an alignment grating include (i) shortening the wavelength of radiation used, (ii) increasing the numerical aperture (NA) of the alignment sensor optics and/or (iii) using off-axis illumination. A shorter wavelength is not always possible since alignment gratings are often located underneath an absorbing film (for example an amorphous carbon hard mask). Increasing the NA is in general possible but may not be preferred since there is a desire for a compact objective with a safe distance from the substrate. Therefore using off-axis illumination is attractive.

Position Measurement with Off-Axis Illumination

FIG. 4 illustrates an optical system 400 of an alignment sensor that is a modified version of one described in U.S. Pat. No. 6,961,116 and US 2009/0195768 mentioned above. This introduces the option of off-axis illumination modes which, among other things, allow a reduced pitch of alignment mark for greater accuracy. The optical system may also allow scatterometry type measurements to be performed with the alignment sensor, rather than with a separate scatterometer instrument. In FIG. 4, for simplicity, the details of multiple wavelengths and polarizations are omitted. More detail of these aspects of the optical system will be described with reference to FIG. 9.

An optical axis O which has several branches is indicated by a broken line running throughout the optical system 400. For ease of comparison with the schematic diagram of FIG. 3, some parts of the optical system 400 are labeled with reference signs similar to those used in FIG. 3, but with prefix “4” instead of “2”. Thus, there is a radiation source 420, an illumination beam 422, an objective lens 424, an information carrying beam 426, a self-referencing interferometer 428 and a detector 430. In practice, multiple detectors may be provided, which will be described in a little more detail below, with reference to FIG. 9. Signals from the detector are processed by processing unit PU, which is modified so as to implement the features described below and to output an (improved) position measurement POS for each mark.

Additional components illustrated in this more detailed schematic diagram are as follows. In an illumination subsystem 440, radiation from source 420 is delivered via an optical fiber 442 to an illumination profiling optic 446. This delivers input beam 422 via beam splitter 454 to objective lens 424 having a pupil plane P. Objective lens 424 forms a spot 406 on alignment mark 202/204/210. Information-carrying beam 426, diffracted by the mark, passes through beam splitter 454 to interferometer 428. Interferometer 428 splits the radiation field into two parts with orthogonal polarization, rotates these parts about the optical axis by 180° relative to one another, and combines them into an outgoing beam 482. A lens 484 focuses the entire field onto a detector 430, which is an arrangement similar to the alignment sensor of FIG. 3. The detector 430 in this example and the detector in the alignment sensor of FIG. 3 are effectively single photodiodes and do not provide any spatial information except by the scanning motion described already. A detector having spatial resolution in a conjugate pupil plane can be added, to allow an angle-resolved scatterometry method to be performed using the alignment sensor hardware.

Included in the present example is an asymmetry measuring arrangement 460. Arrangement 460 receives a part 464 of the information carrying beam 426 through a second beam splitter 462 positioned in advance of the interferometer 428. In the present disclosure, a novel technique for the measurement of asymmetry using position information obtained through the detector 430 is described. In principle, a dedicated asymmetry measuring arrangement 460 could be eliminated. However, in the particular embodiments described herein, the techniques are used to obtain additional information on asymmetry, which can be combined with the results of dedicated asymmetry measuring arrangement 460. This allows the apparatus user to improve further the accuracy of asymmetry information available, and thereby to enable a more accurate measurement of position.

Illumination profiling optic 446 can take various forms, some of which are disclosed in more detail in US patent application no. US 61/623,391, filed Apr. 12, 2012, the contents of which are incorporated herein in their entirety by reference. In the examples disclosed therein, an alignment sensor (more generally, a position measuring apparatus) is shown which may allow the use of a reduced grating pitch without the need for spatial resolution on the detector side. By use of one or more novel illumination modes, the apparatus may be able to measure the position of a mark with a wide range of different pitches, for example from less than 1 μm to about 20 microns, without changing the current detector design. A particular feature common to the examples described in US 61/623,391 mentioned above, is the option to use off-axis illumination at a limited range of incidence angles (limited radial extent in the pupil plane). By off-axis illumination, it is meant that source regions of radiation are confined to a peripheral portion of the pupil, that is to say, some distance away from the optical axis. Confining the illumination to an extreme periphery of the pupil reduces the smallest possible pitch of the alignment mark from substantially λ/NA to substantially λ/2NA, where λ is the wavelength of radiation used, and NA is the numerical aperture of an objective lens of the instrument (e.g. the alignment sensor or more generally the position measuring apparatus). The examples described in US 61/623,391 also use a particular distribution of spot mirrors in a beam splitter of the apparatus, which can both provide the desired illumination and act as a field stop for zero order diffracted radiation. A ‘universal’ illumination profile can be designed that allows for aligning on any of the X, Y and XY marks without changing the illumination mode, although this inevitably brings some compromise in performance and/or some complication in the apparatus. Alternatively, dedicated modes can be designed and made to be selectable for use with the different mark types. Different polarizations of illumination can be selected also.
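The pitch limits of substantially λ/NA for on-axis illumination and substantially λ/2NA for illumination from the extreme periphery of the pupil can be evaluated numerically. The sketch below is illustrative only: the function name, the NA value of 0.6 and the four wavelength values are assumptions chosen for the example, not values stated for any particular instrument.

```python
def min_pitch(wavelength, na, off_axis):
    """Approximate smallest usable grating pitch.

    On-axis illumination:            pitch >= wavelength / NA
    Extreme off-axis illumination:   pitch >= wavelength / (2 * NA)
    """
    return wavelength / (2 * na) if off_axis else wavelength / na


# Assumed example values: NA = 0.6 and four illustrative wavelengths in microns.
NA = 0.6
assumed_wavelengths = {"green": 0.532, "red": 0.633, "near IR": 0.780, "far IR": 0.850}

for name, wl in assumed_wavelengths.items():
    on_axis = min_pitch(wl, NA, off_axis=False)
    off_axis = min_pitch(wl, NA, off_axis=True)
    print(f"{name}: on-axis {on_axis:.3f} um, off-axis {off_axis:.3f} um")
```

For each assumed wavelength the off-axis limit is half the on-axis limit, which is the factor-of-two reduction described above.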

A primary function of the illumination profiling optic 446 is to supply coherent radiation from first and second source regions within a pupil of the objective lens 424. The first and second regions are confined to a peripheral portion of the pupil (in the sense of at least being away from the optical axis). They are each limited in angular extent and are positioned essentially diametrically opposite one another with respect to the optical axis. As will be seen from the examples in US 61/623,391, the source regions may take the form of very small spots, or may be more extended in form. Further source regions may be provided, in particular third and fourth source regions may be provided rotated at about 90° from the first and second regions. A particular embodiment of illumination profiling optics 446 comprises a self-referencing interferometer of the same general form as interferometer 428. The apparatus as a whole need not be limited to providing these particular off-axis illumination profiles. It may have other modes of use, whether known or yet to be developed, which favor the use of different profiles. A particular alternative profile, included in discussions below, is one having a single, on-axis region.

It should be noted that in the example shown in FIG. 4 some polarizing elements used in practice around the interferometer have been omitted. This is only done to simplify the explanation of this idea. In a real implementation they may need to be included. Additionally, it may be desired to make measurements with different polarizations according to the mark type, and/or to make measurements with more than one polarization on each mark. The features to achieve desired polarizations can readily be envisaged by the skilled person. Some more detail will be given below with reference to FIG. 9.

Referring to FIGS. 5 and 6, selection of on- and off-axis illumination modes for the different mark types shown in FIGS. 2(a) and (b) is depicted. An example that will be accommodated in the examples below is an on-axis illumination profile, for compatibility with existing marks and measurement methods. Referring firstly to the example of an on-axis mode, as used in the sensor of FIG. 3, illumination normal to the mark is provided by an on-axis illumination profile 448(O) having a central bright spot within an otherwise dark pupil 452, as seen in FIG. 5(a). This profile is an optional setting in the illumination beam 422 of the apparatus. In this example, it is desired for the zero order beam which returns along the optical axis to be blocked before entry to interferometer 428, but also for it to be transferred to the asymmetry measuring arrangement 460 (when provided). To block the zero order before the interferometer is not essential, but improves the signal to noise ratio of the position signal. Accordingly, in this embodiment, a spot mirror 470 is included in the second beam splitter 462. The first splitter 454 is not silvered, and one accepts that only 50% or so of the intensity of the central spot may be transferred to the mark. In an alternative embodiment, where the arrangement 460 is omitted, this profile may be produced directly by illumination profiler 446 and transmitted at full intensity to objective 424 by a spot mirror within the first beam splitter 454. A variety of alternatives can be envisaged to obtain a desired profile.

The horizontal dotted line in FIGS. 5-8 represents the direction of periodicity of a mark being read, in this case an X direction mark. As seen in FIG. 5(b), diffraction spots of −1 and +1 order occurring in direction X will fall safely within the pupil of the optical system, so long as the grating pitch is λ/NA or greater. The same is true for the cases of Y and XY marks (not illustrated). In general, an integer n may represent any diffraction order above zero. An alignment signal can be extracted when the +n order overlaps with the −n order. This is done using the self-referencing interferometer 428 that mixes +90° and −90° rotated copies of the incoming radiation field, giving the profile 482(O) seen at FIG. 5(c).

When off-axis illumination is used, bright spots of coherent radiation can be produced at peripheral positions, as also illustrated in FIG. 4. The spots in the profile 448 are in two pairs, with 180° symmetry in each pair. The pairs are at 90° to one another, and located at 22.5° to the X and Y axes. The spots have a limited radial extent and a limited angular extent in the pupil plane. By providing such a pattern of spots, all three grating directions can be supported, either in a single illumination mode, or by modes which are readily selectable in the hardware. US 61/623,391 discloses various methods of producing such profiles, including by spot mirrors and by use of a self-referencing interferometer of the same form as interferometer 428. As discussed already in the context of the on-axis illumination, these spots could be matched by spot mirrors in beam splitter 454 so as to form the desired illumination profile 448 at the pupil plane P of objective lens 424 without wasting radiation. In this embodiment, however, the spot mirrors 472 are placed instead in the splitter 462 as shown, so that they can deliver zero order diffracted beams to the asymmetry measuring arrangement 460.

The spots and spot mirrors are likely to be much smaller in practice than the large spots illustrated schematically here. For example, for a pupil diameter of a few centimeters, the spot size may be less than 1 millimeter. The optical system as shown is only presented for the discussion of an embodiment of the present invention, and additional components can be added in a practical implementation. As one example, one or more additional beam splitters can be provided in the path of information carrying beam 426, to collect portions of the radiation for other purposes. For example, another splitter with part-silvered spot mirrors could be placed between splitters 454 and 462, to collect some radiation for measurement of intensity. Alternatively or in addition, portions of the radiation can be collected in the arrangement 460 for similar purposes.

FIG. 6 shows (a) an off-axis illumination profile 448, (b) a diffraction pattern in the information carrying beam 426 and (c) an interferometer output 482 for an X-direction mark having almost half the pitch of the mark used in FIG. 5, where a suitable pair of spots of the available illumination spots are chosen to be illuminated. In this instance, despite the reduced pitch and consequently greater angle, the orders +1 and −1 fall within the pupil, which is sufficient for recognition of the mark position. This represents a lower limit for the grating pitch that is substantially λ/2NA, i.e. half what applied in the known instrument. The circle in each diagram again represents the pupil of the optical system, while the direction of periodicity in the mark is represented by the dotted line crossing the circle. In FIG. 6(a), two spots of illumination are positioned diametrically opposite one another, providing the illumination profile with 180° symmetry about the optical axis (O, not shown). (The skilled reader will understand that these spots exist in the pupil plane and are not to be confused with the spot on the mark itself, or in an image of the mark. On the other hand, 180° rotation in the pupil plane is equivalent to 180° rotation in the image plane also.) The spots are not positioned on the X axis (dotted line), but rather offset from it by a small angle, in this example 22.5°. Consequently, the spots are offset from one another in a direction transverse to the X axis, that is to say, transverse to the direction of periodicity of the grating. At FIG. 6(b), the resulting diffraction pattern caused by the grating of the alignment mark 202 is depicted. For one spot, the diffraction order +1 is within the pupil. For the other spot, the diffraction order −1 is within the pupil, at a position 180° rotated from the order +1. A zeroth order diffraction (specular reflection) of each spot coincides exactly with the position of the other spot.

If the pitch of the grating were to increase, additional orders −2 and +2 etc. may fall within the pupil. Because of the offset mentioned already, the diffraction orders of each spot remain separate from those of the other spot, irrespective of the pitch of the grating. An apparatus can be envisaged in which the offset is not present, and the illumination spots lie exactly on the X, Y and/or XY axes. However, such an arrangement may place constraints on the combinations of mark pitches and radiation wavelengths that can be used, if one is to avoid unwanted overlap between diffraction orders, and to avoid wanted diffraction orders being blocked. In an embodiment where broadband or polychromatic radiation is used, the higher order diffraction signals will not be a single spot, as shown here, but rather will be spread into a first order spectrum, second order spectrum and so forth. The potential for unwanted overlap between orders is thereby greater. The orders will be represented as spots here for simplicity only.
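The geometry of the off-axis spots and their diffraction orders in the pupil can be sketched numerically. This is a simplified model under stated assumptions: pupil coordinates are expressed in units of the sine of the ray angle, the zero order of a spot at (x, y) is taken to emerge at (−x, −y), an X-direction grating shifts order n by nλ/pitch along X, and the spot radius, wavelength, pitch and NA values are hypothetical. The function name `diffracted_orders` is introduced here for illustration.

```python
import math


def diffracted_orders(spot, wavelength, pitch, na, max_order=3):
    """Return {n: (x, y)} for the orders of an X-periodic grating inside the pupil.

    spot -- (x, y) pupil position of the illumination spot, in sine units.
    The specular (zero order) beam is assumed to emerge at (-x, -y).
    """
    x0, y0 = -spot[0], -spot[1]
    shift = wavelength / pitch  # pupil shift per diffraction order, along X only
    orders = {}
    for n in range(-max_order, max_order + 1):
        x, y = x0 + n * shift, y0
        if math.hypot(x, y) <= na:  # keep only orders captured by the objective
            orders[n] = (x, y)
    return orders


# Hypothetical values: NA 0.6, spots at radius 0.55, offset 22.5 degrees from X.
NA, RADIUS, ANGLE = 0.6, 0.55, math.radians(22.5)
spot_a = (RADIUS * math.cos(ANGLE), RADIUS * math.sin(ANGLE))
spot_b = (-spot_a[0], -spot_a[1])  # diametrically opposite spot

orders_a = diffracted_orders(spot_a, wavelength=0.633, pitch=0.7, na=NA)
orders_b = diffracted_orders(spot_b, wavelength=0.633, pitch=0.7, na=NA)
print(sorted(orders_a), sorted(orders_b))
```

In this model the orders of one spot all share one Y coordinate and the orders of the other spot share the opposite Y coordinate, so the two sets cannot coincide whatever the pitch, while the +1 order of one spot and the −1 order of the other lie 180° rotated from one another.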

FIG. 6(c) shows the result of passing the diffraction signal at FIG. 6(b) through interferometer 428 that mixes +90° and −90°-rotated copies of the mark image. It is assumed that the 0th order spots are blocked by a field stop at some point prior to the interferometer. A simple implementation of such a field stop would be the spot mirrors 472, where provided. The positive and negative signals for each higher order are superimposed and become mixed as indicated by +1/−1, +2/−2, etc. Provided that the original illumination spots are coherent with one another, the effect is the same as the mixing of positive and negative orders of a single illumination spot. Accordingly the interferometer, detection optics and detection electronics of the position measuring apparatus can be the same as in the apparatus of FIG. 3. The processing of detected signals to obtain a position measurement can be substantially the same also.

The directions in which the higher order spots will be found in the diffracted radiation field are indicated for the X, Y and XY marks by white dotted lines on the profiles 448 and 448(O) as illustrated in FIG. 4. The illumination profile 448 in each mode has the properties: (i) each spot is limited in radial and angular extent and (ii) within each spot pair the spots are offset from one another in a direction transverse to any of the directions of periodicity of the X, Y or XY marks. Accordingly, higher order spots lying along these diffraction directions will not interfere with one another, at least in the middle part of the field. Adjustable field stop 490 can be provided to reduce the risk of overlap further, particularly where coarse marks are being measured. More detail of this is contained in US 61/623,391 mentioned above.

The prior application further illustrates the diffraction patterns and interferometer outputs for illumination modes designed for a Y-direction mark (204 in FIG. 2(a)) and for an XY mark (210 in FIG. 2(b)). Everything that has been said above with respect to parts (a), (b) and (c) of FIGS. 5 and 6 applies equally to these parts. Because the XY mark has portions with different orientations of grating lines, each at 45° to the X and Y axes, two pairs of spots are provided in the illumination profile. As in the X and Y cases, the spots of each pair are positioned diametrically opposite one another, and slightly offset from one another in a direction transverse to the direction of periodicity of the grating. Note that the two pairs of spots do not need to be present at the same time when scanning the XY mark: each pair can be switched on for scanning the portion of the mark that has the corresponding direction of periodicity. If both pairs of spots are illuminated all the time while scanning the XY mark, then the diffraction orders received by the objective from the substrate will be only those corresponding to the direction of periodicity in the particular part of the mark being scanned for suitably small pitches.

The illumination profiles can be produced in a number of ways to form a practical instrument, bearing in mind that the opposed segments should be coherent for the interferometer 428 to produce the desired signal. Particularly when a broadband source is involved, the coherence length/time of the source radiation will be short. Even with a monochromatic laser source, U.S. Pat. No. 6,961,116 teaches that a short coherence time is desired, for example to eliminate interference from undesired multiple reflections. Consequently, optical path lengths from the source to each segment should be closely matched. An aperture corresponding directly to the desired profile could be placed in a widened parallel beam, but that would result in a relatively large radiation loss. To circumvent the loss of radiation, various alternative solutions are proposed in US 61/623,391 mentioned above.

The illumination emerging from the illumination source 420 may be monochromatic but is typically broadband in nature, for example white light, or polychromatic. A diversity of wavelengths in the beam increases the robustness of the measurement. The sensor may use, for example, a set of four wavelengths named green, red, near infrared and far infrared. In a sensor implementing an embodiment of the present invention, the same four wavelengths could be used, or a different four, or more or fewer than four wavelengths might be used.

The mark may need to be scanned more than once if it is desired for example to measure position using two different polarizations. Also the illumination mode may be switched midway through scanning the XY mark. In other embodiments, however, multiplexing of optical signals is used so that two measurements can be made simultaneously. Similarly, multiplexing can be applied so that different portions of the XY mark can be scanned and measured without switching illumination mode. A simple way to perform such multiplexing is by frequency division multiplexing. In this technique, radiation from each pair of spots and/or polarization is modulated with a characteristic frequency, selected to be much higher than the frequency of the time-varying signal that carries the position information. The diffracted and processed optical signals arriving at detector 430 will be a mixture of two signals, but they can be separated electronically using one or more filters tuned to the respective frequencies of the source radiation. Time division multiplexing could also be used, but this would involve accurate synchronization between source and detector. The modulation at each frequency can be a simple sine or square wave, for example.
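The separation of two frequency-multiplexed signals at a single detector can be illustrated with a small numerical sketch. Note the assumptions: the text above speaks of filters tuned to the modulation frequencies, whereas this sketch uses synchronous (lock-in style) demodulation as one possible stand-in; the signal levels are held constant over the analysis window, and the carrier frequencies, sample rate and function names are hypothetical.

```python
import math


def detector_samples(levels, carriers, sample_rate, count):
    """Simulate one detector receiving several sources, each sinusoidally
    intensity-modulated between 0 and its level at its own carrier frequency."""
    return [
        sum(level * 0.5 * (1.0 + math.cos(2 * math.pi * f * k / sample_rate))
            for level, f in zip(levels, carriers))
        for k in range(count)
    ]


def demodulate(samples, carrier, sample_rate):
    """Recover the level of the source modulated at `carrier`.

    Multiplying by the reference and averaging over a whole number of carrier
    periods rejects the DC term and the other carrier; the factor 4 undoes the
    0.5 modulation depth and the 0.5 from averaging cos squared.
    """
    acc = sum(s * math.cos(2 * math.pi * carrier * k / sample_rate)
              for k, s in enumerate(samples))
    return 4.0 * acc / len(samples)


# Hypothetical set-up: two sources at 10 kHz and 15 kHz, sampled at 1 MHz for 1 ms.
samples = detector_samples([0.8, 0.3], [10e3, 15e3], sample_rate=1e6, count=1000)
level_1 = demodulate(samples, 10e3, 1e6)
level_2 = demodulate(samples, 15e3, 1e6)
print(level_1, level_2)
```

The window of 1000 samples contains exactly 10 and 15 periods of the two carriers, which is what makes the two contributions separate cleanly in this sketch. In practice the recovered levels would vary slowly with the scan position, which is why the carriers are chosen much higher in frequency than the position signal.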

If it is desired to illuminate a mark with circular polarization, whether for position sensing or some other form of metrology, a quarter wave plate (not shown) can be inserted between beam splitter 454 and objective 424. This has the effect of turning a linear polarization into a circular one (and changing it back again after diffraction by the mark). The spot positions are chosen as before according to the mark direction. The direction of circular polarization (clockwise/counterclockwise) can be changed by selecting a different linear polarization in the illumination source 420, fiber 442 or illumination profiling optic 446.

Referring briefly to FIGS. 7 and 8, these show (b) diffraction patterns and (c) interferometer outputs for the same illumination profiles (a) as were shown in FIGS. 5 and 6. The difference is that in FIGS. 7 and 8 it is assumed that the illumination contains a number of different wavelengths. As mentioned already, the alignment sensor may use, for example, a set of four wavelengths named green, red, near infrared and far infrared. These provide robust position readout from a range of marks, which may have to be read through overlying layers of different materials, different material properties and/or different thicknesses. Whereas the first order signals for monochromatic light appear as single spots in FIGS. 5 and 6, FIGS. 7 and 8 depict how the different wavelengths present in the illumination of the alignment sensor are spread into spectra. Where the illumination comprises several discrete wavelengths, and where the spots in practice are very much smaller than the spots illustrated here, the diffracted spots for the different colors will not necessarily overlap in the way that they are shown in FIGS. 7 and 8. They could be separated by providing an image sensor in a conjugate pupil plane, or by measuring the different colors sequentially, as in a scatterometer. However, each spot may cover only one pixel or less, and an image sensor may bring noise and heat dissipation that should be avoided if possible in the alignment sensing environment. Note that in the coarser pitch mark used in FIG. 7, the diffracted orders are significantly closer to the zero order spot at the center. In the off-axis illumination mode with a mark of finer pitch, the first orders for the different colors are more spread out, and further from the zero orders.

While the examples described herein concentrate on 0^(th) order and +/−1^(st) order diffraction signals, it will be understood that the disclosure extends to the capture and analysis of higher orders, for example +/−2^(nd) orders, more generally +/−n^(th) orders. In the examples, the 1^(st) orders only are shown and discussed, for simplicity.

FIG. 9 illustrates in more detail aspects of the apparatus of FIG. 4, concerned with measurement using multiple wavelengths of radiation, and concerned with the management of polarization effects. The same reference numbers are used for components seen in FIG. 4, while some of those components are seen here with details not seen in FIG. 4. In illumination subsystem 440, source 420 comprises, for example, four individual sources to provide radiation with four wavelengths named green (labeled G), red (R), near infrared (N) and far infrared (F). For convenience in the following discussion, the radiation at these four different wavelengths will be called four colors of light, it being immaterial for present purposes whether they are in the visible or non-visible parts of the electromagnetic spectrum. All the sources are linearly polarized, with the G and N radiation being polarized in the same direction as one another, and the R and F radiation being polarized orthogonally to the G and N polarization.

The four colors are transported by polarization maintaining fiber to a multiplexer 502, where they are combined into a single four-color beam. The multiplexer maintains linear polarization, as indicated by arrows 504. The arrows 504 and similar arrows throughout the diagram are labeled G and R to indicate polarization of the green and red components. The N and F components are oriented the same as the G and R components, respectively.

This combined beam goes via suitable delivery optic 506 into beam splitter 454. As already described, the beam then reflects from a partially or fully reflecting surface (e.g. a 0.5 mm diameter spot mirror), which is inside the beam splitter. The objective lens 424 focuses the beam to a narrow beam which is reflected and diffracted by the grating formed by alignment mark 202 on the substrate. Radiation is collected by the objective, with for example numerical aperture NA=0.6. This NA value may allow at least ten orders of diffraction to be collected from a grating with 16 μm pitch, for each of the colors.
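The statement that NA=0.6 may allow at least ten orders to be collected from a 16 μm pitch grating can be checked with the grating equation for normal incidence, under which order n leaves the grating at sin θ = nλ/pitch and is collected when this does not exceed NA. The sketch below is illustrative: the four wavelength values are assumptions chosen for the example (the text names the colors but does not give their wavelengths), and on-axis illumination is assumed.

```python
import math


def collected_orders(wavelength, pitch, na):
    """Highest diffraction order n with n * wavelength / pitch <= NA
    (normal-incidence illumination assumed)."""
    return math.floor(na * pitch / wavelength)


# NA and pitch from the text; wavelengths in microns are assumed for illustration.
NA, PITCH = 0.6, 16.0
assumed_wavelengths = {"G": 0.532, "R": 0.633, "N": 0.780, "F": 0.850}

orders = {color: collected_orders(wl, PITCH, NA)
          for color, wl in assumed_wavelengths.items()}
print(orders)
```

With these assumed wavelengths every color yields more than ten collected orders on each side of the zero order, consistent with the statement above; the longest wavelength sets the limit.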

The reflected and diffracted radiation forming information carrying beam 426 is then transported to the self-referencing interferometer 428. In this example, as already described, the beam is split by beam splitter 462 to supply a portion 464 of the information carrying beam to the asymmetry measuring arrangement 460, when provided. Signals 466 conveying asymmetry measurement information are passed from arrangement 460 to the processing unit PU. Just before the interferometer, polarization is rotated by 45° by a half wave plate 510. From this point on, polarization arrows are shown for only one color, for clarity. The interferometer, as already described above and in U.S. Pat. No. 6,961,116, comprises a polarizing beam splitter, where half of each color is transmitted, and half of each color is reflected. Each half then is reflected three times inside the interferometer, rotating the radiation field by +90° and −90°, giving a relative rotation of 180°. The two fields are then superimposed on top of each other and allowed to interfere. A phase compensator 512 is present to compensate for path differences between the −90° and +90° images. The polarization is then rotated 45° by another half wave plate 514 (having its major axis set at 22.5° to the X or Y axis). The half wave plates 510, 514 are substantially wavelength insensitive, so that polarizations of all four wavelengths are rotated by 45°.

A further beam splitter 516 (not shown in FIG. 4) splits the optical signal into two paths designated A and B. One path contains the sum of the two rotated fields, and the other contains the difference. Depending on the initial polarization direction, the sum ends up in path A or path B. So in this example the sum signals for G and N end up in one path, and R and F in the other. For each color, the corresponding difference signal ends up in the other path.
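How the sum and difference paths yield complementary position-dependent intensities can be sketched with a simplified scalar model. The assumptions are labeled plainly: only the +n and −n orders are considered, they have equal amplitude (a symmetric mark), a mark displacement x adds a phase of +2πnx/pitch to one order and −2πnx/pitch to the other, and the final splitter divides the power equally. The function name and parameters are introduced here for illustration only.

```python
import cmath
import math


def path_intensities(x, pitch, amplitude=1.0, order=1):
    """Intensities in the sum and difference paths for mark displacement x.

    Simplified model: E(+n) = A exp(+i phi), E(-n) = A exp(-i phi),
    with phi = 2 pi n x / pitch.
    """
    phi = 2 * math.pi * order * x / pitch
    e_plus = amplitude * cmath.exp(1j * phi)
    e_minus = amplitude * cmath.exp(-1j * phi)
    i_sum = abs(e_plus + e_minus) ** 2 / 2    # equals A^2 * (1 + cos(2 phi))
    i_diff = abs(e_plus - e_minus) ** 2 / 2   # equals A^2 * (1 - cos(2 phi))
    return i_sum, i_diff


# Scan a hypothetical 16 micron pitch mark over a quarter of a pitch.
for x in (0.0, 2.0, 4.0):
    print(x, path_intensities(x, pitch=16.0))
```

In this model the two paths vary in antiphase and their total is constant, and for the first orders the intensity repeats every half pitch of mark displacement.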

Note that this arrangement chooses to use one polarization for illumination in each color. Measurements with two polarizations per color could be made, by changing the polarization between readings (or by time division multiplexing within a reading). However, to maintain high throughput while benefiting from some diversity in color and polarization, a set of different colors with single, but different, polarizations represents a good compromise between diversity and measurement throughput. To increase diversity without impacting throughput, one can envisage an implementation similar to the four-color scheme presented here, but using more colors, for example eight or sixteen, with mixed polarizations.

The radiation for each path A and B is collected by a respective collector lens assembly 484A and 484B. It then goes through an aperture 518A or 518B that eliminates most of the radiation from outside the spot on the substrate. Multimode fiber 520A and 520B transports the collected radiation of each path to a respective demultiplexer 522A and 522B. The demultiplexer splits each path into the original four colors, so that a total of eight optical signals are delivered to detectors 430A and 430B. In one practical embodiment, fiber goes from the demultiplexer to eight detector elements on a detector circuit board. The detectors provide no spatial resolution, but deliver time-varying intensity signals I_(A) and I_(B) for each color, as the apparatus scans the mark 202. The signals are actually position-dependent signals, but received as time-varying signals (waveforms) synchronized with the physical scanning movement between the apparatus and the mark (recall FIG. 3).

Processing unit PU receives the intensity waveforms from the eight detectors and processes them to provide a position measurement POS. Because there are eight signals to choose from, based on different wavelengths and incident polarizations, the apparatus can obtain useable measurements in a wide variety of situations. In this regard it should be remembered that the mark 202 may be buried under a number of layers of different materials and structures. Some wavelengths will penetrate different materials and structures better than others. Processing unit PU conventionally processes the waveforms and provides a position measurement based on the one which is providing the strongest position signal. The remaining waveforms may be disregarded. In a simple implementation, the ‘recipe’ for each measurement task may specify which signal to use, based on advance knowledge of the target structure, and experimental investigations. In more advanced systems, for example as described in the paper by Huijbregtse et al. mentioned above, an automatic selection can be made, using “Color Dynamic” or “Smooth Color Dynamic” algorithms to identify the one or more best signals without prior knowledge.
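The conventional selection of the strongest position signal among several waveforms can be sketched as follows. This is a hypothetical illustration, not the “Color Dynamic” or “Smooth Color Dynamic” algorithms mentioned above: signal strength is taken here to be half the peak-to-peak excursion of each waveform, which is an assumption of this sketch, and the waveforms are synthetic.

```python
import math


def modulation_depth(waveform):
    """Half the peak-to-peak excursion, used here as a simple strength measure."""
    return (max(waveform) - min(waveform)) / 2.0


def strongest_signal(waveforms):
    """Return the key of the waveform with the largest modulation depth."""
    return max(waveforms, key=lambda name: modulation_depth(waveforms[name]))


def synthetic(amplitude, samples=200, periods=5):
    """Synthetic intensity waveform: constant background plus a sine modulation."""
    return [1.0 + amplitude * math.sin(2 * math.pi * periods * k / samples)
            for k in range(samples)]


# Hypothetical waveforms keyed by (color, path); amplitudes are illustrative.
waveforms = {
    ("G", "A"): synthetic(0.05),
    ("R", "B"): synthetic(0.40),
    ("N", "A"): synthetic(0.15),
    ("F", "B"): synthetic(0.02),
}
best = strongest_signal(waveforms)
print(best)
```

A practical system would more likely fit the expected periodic signal to each waveform and weigh noise as well as amplitude; the peak-to-peak measure is used here only to keep the sketch short.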

‘Discarded’ waveforms, when considered together as a set, may contain useful information about the structure and materials. In particular, they may contain information about asymmetry of the structure, which will be exploited to provide an alternative or additional asymmetry measurement technique as described further below. In addition, the set of signals can contain other information on the ‘stack’, that is the sequence of layers lying on top of the mark, and possibly beneath it as well. It will be appreciated that by using more of the information present in these existing signals, the proposed technique makes more efficient use of the total amount of photons reflected and diffracted by the substrate.

Also described in the Huijbregtse et al. paper is the use of multiple gratings in a composite target. Each grating has a different profile, enhancing for example higher diffraction orders (3^(rd), 5^(th), 7^(th)). Position measurements can be derived from different ones of these gratings, as well as from different color signals on an individual grating. In the present disclosure, it will be assumed that there is a single grating with a simple bar pattern. The skilled reader can readily expand the disclosure to envisage embodiments having multiple gratings with different patterns.

Asymmetry Measurement—Introduction

As described so far, the position measurement apparatus is used for example to obtain an alignment position in a lithographic apparatus such as that shown in FIG. 1. An error may be made when the alignment mark is asymmetric. The alignment error caused by an asymmetric alignment mark can contribute to the overlay error in devices made using the measurements in operation of the lithographic apparatus. By adding an asymmetry detection function to the position measurement apparatus, asymmetry of the mark can be measured using much of the same hardware as the position measurement, and simultaneously with the position measurement if desired. This measurement raises the possibility of correcting the alignment error caused by asymmetry, during alignment of the lithographic apparatus. The following are some techniques that may be used in combination with an embodiment of the invention.

Metrology tools are available commercially to measure asymmetry. However, these may neither be integrated with the alignment sensor nor be fast enough to operate with the alignment sensor without harming throughput of a lithographic process. One such apparatus is an angle-resolved scatterometer that uses a CCD-array in a conjugate pupil plane to measure the intensity asymmetry in a diffraction spectrum. The scatterometer measures asymmetry sequentially for a number of colors. In the alignment sensor, the position signals from different colors may be measured in parallel for speed. Additionally, speed, noise and power (heat) dissipation may present challenges to the asymmetry measuring arrangement, if it is to be integrated in an alignment sensor.

Several different approaches are possible for adding an asymmetry measuring function to the position measuring apparatus. As mentioned already, an asymmetry measuring arrangement 460 may be included in the apparatus, which processes a portion 464 of the information carrying beam 426 diverted by beam splitter 462. The form of the asymmetry measuring apparatus 460 can vary.

US 61/623,391, mentioned above, describes an asymmetry measuring arrangement that includes a camera to capture pupil plane images of the diffracted radiation. These images can be used for angle-resolved scatterometry. By comparing intensities of image portions corresponding to positive and negative orders of diffraction, asymmetry can be measured. The option to add such a pupil image camera as an asymmetry measuring arrangement in an alignment sensor is discussed in US 61/623,391. US 61/623,391 also mentions another technique for measuring asymmetry through the interferometer and detector 430. This uses illumination profiles in which off-axis illumination is provided from one side only at a time, allowing the apparatus to measure the intensity of the +1 order and −1 order separately from one another.

In US patent application no. US 61/684,006, filed 16 Aug. 2012, the contents of which are incorporated herein in their entirety by reference, a further form of asymmetry measuring arrangement 460 is proposed. In this form of arrangement, the illumination spot on the substrate is imaged onto a detector. Special optical elements are included in the optical path prior to imaging, which deflect positive and negative diffraction orders so that radiation of the different diffraction orders is separated and used to image the spot onto separate detectors.

Any of the arrangements just mentioned can be used to implement an asymmetry measuring arrangement 460 in the present apparatus. The following description concerns a further technique for measuring asymmetry, using the existing position measuring hardware. This technique may be used instead of or in combination with the arrangement 460, which may take either (or both) of the forms described in the mentioned prior applications, or may take another form entirely.

Asymmetry Measurement from Position Signals

FIG. 10 illustrates a method of measuring position of a mark which includes a method of measuring asymmetry based on position information. Alternative methods are possible within the scope of the present disclosure. In particular, the steps of this method can be implemented in combined forms, and do not necessarily need to be performed separately and sequentially, as presented here.

In step S1, the mark is scanned as described above, and multiple waveforms are recorded, according to the different colors, polarizations and/or the like that are accessible in the optical system. Referring to the example of FIG. 9, eight waveforms will be obtained per mark, corresponding to four color/polarization combinations and two complementary waveforms per color (sum and difference signals from branches A and B). Different implementations can yield different waveforms. In step S2, multiple position measurements are obtained from one or more of the waveforms in a manner which can be, for the sake of this example, a conventional manner. Within each color waveform there are also multiple position signals, based on different periodic components (harmonic orders), so that a great number of different candidate position measurements are actually available in practice (several tens). At this point, it would be possible to obtain a single position measurement by judging which of the position signals contains the strongest position-dependent variation and/or the best signal to noise ratio. In the embodiment described here, the selection or calculation of a single position measurement will be deferred until after all the candidate measurements have been corrected for asymmetry (see step S6).

In step S3 asymmetry information is obtained from the asymmetry measuring arrangement 460 (asymmetry sensor for short). Asymmetry information may be obtained alternatively or additionally from some source external to the position measuring apparatus.

In step S4, rather than discarding additional information from the multiple position signals derived from the waveforms captured in step S1, information from multiple signals is used to obtain a refined measurement of asymmetry or an asymmetry-dependent parameter. The manner in which this is done can vary, and examples will be explained further below. Increasing the measured information used can be beneficial to help ‘break’ unknown correlations between measured alignment position and various parameters of the target grating. It can also increase the total number of photons used, and hence improve the signal to noise ratio.

At step S5, the refined asymmetry measurement derived in step S4 is used to apply a correction to each of the positions measured in step S2. In step S6, a “best” measured position is calculated by selecting and/or combining results from among the multiple corrected position measurements. This measurement, which has improved accuracy due to reduced asymmetry sensitivity, is output at step S7 either for use in a lithographic process, or more generally as a metrology result.

FIG. 11 gives more detail of step S4 in which asymmetry measurement is obtained from the multiple waveforms collected by the position sensing apparatus. Steps within step S4 will be numbered S41, etc. As already mentioned, some of the processing can be performed together with processing for steps S2 and S3, and need not be separated and performed sequentially. Similarly, processing for the sub-steps S41, etc. does not need to be performed separately and sequentially in the manner implied by the flowchart. The flowchart is merely presented as an aid to description of the overall method of one exemplary embodiment.

In step S41 the waveforms (position-dependent intensity signals) from the eight elements (in this example) of detectors 430A, 430B are received. At step S42, each waveform is decomposed into separate components. For example, a discrete Fourier transform (DFT) may be used to decompose the waveform into a set of component waveforms that are essentially harmonics of the period of the grating forming the mark 202. If the waveform were purely sinusoidal with period P/2, then only a first order component would have any magnitude. In a real target and a real instrument, however, several odd and even harmonics may be present in different phases and amplitudes. As described in the Huijbregtse et al. paper mentioned above, different target gratings may even be designed specifically to introduce strong higher-order signals. These multiple orders will be exploited to learn more about the structure of the target (including overlying stack layers). The result of step S42 is thus a set of numerous components, of different orders, but also of different color/polarization combinations. Each of these components in principle can yield a position measurement. Therefore taking for example five orders for each of the eight waveforms will yield 40 different position measurements.
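By way of illustration only, the decomposition of step S42 and the phase-based position estimate can be sketched in Python with NumPy. The function names, the sampling scheme (a whole number of signal periods, uniformly sampled) and the synthetic waveform are assumptions made for this sketch and are not taken from the apparatus described here.

```python
import numpy as np

def harmonic_components(waveform, samples_per_period, orders):
    """Return {order: (amplitude, phase)} for harmonics of the signal period.

    Assumes the waveform covers a whole number of signal periods, so that
    each harmonic falls exactly on one DFT bin (no spectral leakage).
    """
    n = len(waveform)
    n_periods = n / samples_per_period
    spectrum = np.fft.rfft(waveform) / n
    components = {}
    for k in orders:
        bin_index = int(round(k * n_periods))
        c = spectrum[bin_index]
        # Factor 2: a real cosine splits its amplitude over +/- frequencies.
        components[k] = (2.0 * abs(c), np.angle(c))
    return components

def position_from_phase(phase, signal_period, order):
    """Convert the phase of one harmonic into a position offset.

    A shift dx of the mark changes the phase of harmonic k by
    -2*pi*k*dx/signal_period; the result is unique only within one
    period of that harmonic.
    """
    return -phase * signal_period / (2.0 * np.pi * order)

# Synthetic example: first and third harmonics, mark offset x0.
signal_period, x0 = 1.0, 0.05
x = np.arange(512) / 64 * signal_period
waveform = (1.0
            + 0.5 * np.cos(2 * np.pi * (x - x0) / signal_period)
            + 0.2 * np.cos(2 * np.pi * 3 * (x - x0) / signal_period))
comps = harmonic_components(waveform, 64, [1, 3])
for k, (amp, ph) in comps.items():
    print(k, amp, position_from_phase(ph, signal_period, k))
```

In this sketch each harmonic yields its own candidate position, which mirrors the way several orders per waveform multiply the number of candidate measurements.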

In step S43 a position measurement is calculated from each of the multiple components (color/polarization and order), which in practice is a matter of sharing the results already calculated in step S2 (FIG. 10). In step S44 a variance (noise estimate) is calculated for each position measurement, based on understanding of the shape of the grating and the observed signals. Besides position information, the component signals derived from the waveforms may contain other information that can be used to refine a model of the target structure. In the example described here, step S43 relies on a phase characteristic of the waveform components to calculate position. By contrast, in step S45 the processor analyzes intensity information of the different components, to obtain additional information that can be used to refine the target model. In step S46 the processor calculates variance of the intensity information.

In step S47, the numerous different position measurements are combined with a model of the target structure (mark 202) to identify best fitting parameters of that model. In particular, for the purposes of asymmetry measurement, an asymmetry-dependent parameter is included in the model. The variance calculated in step S44 is used as a measure of the quality of the corresponding position measurement obtained in step S43. Similarly, the variance calculated in step S46 is used as a measure of the quality of the corresponding intensity-based measure obtained in step S45. All of these results in turn are weighed against the asymmetry measurement per color/polarization and order that was obtained from the asymmetry sensor to obtain a single “best” measurement of asymmetry, which is then output at step S48.
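One simple way to picture the role of the variances in step S47 is a weighted least-squares fit, sketched below in Python. The linear model (each candidate position equals the true position plus a per-component sensitivity times a single asymmetry parameter) and the sensitivity values are hypothetical and chosen purely for illustration; the form of the target model is not limited to this.

```python
import numpy as np

def fit_position_and_asymmetry(positions, variances, sensitivities):
    """Weighted least-squares fit of the toy model
        positions[i] = x_true + sensitivities[i] * asymmetry
    using weights 1/variances[i], so that noisier components count less.
    Returns (x_true, asymmetry).
    """
    positions = np.asarray(positions, dtype=float)
    weights = 1.0 / np.asarray(variances, dtype=float)
    design = np.column_stack([np.ones_like(positions),
                              np.asarray(sensitivities, dtype=float)])
    # Scaling rows by sqrt(weight) turns ordinary into weighted least squares.
    sw = np.sqrt(weights)
    params, *_ = np.linalg.lstsq(design * sw[:, None], positions * sw, rcond=None)
    return params[0], params[1]

# Hypothetical sensitivities for four color/order components.
s = np.array([0.1, -0.2, 0.5, 0.3])
measured = 10.0 + 2.0 * s            # noise-free candidates for the check
x_true, asym = fit_position_and_asymmetry(measured, [1.0, 4.0, 0.5, 2.0], s)
print(x_true, asym)
```

The fit needs components with differing sensitivities; if all components responded identically to asymmetry, position and asymmetry could not be separated in this toy model.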

Exemplary Implementation

A particular implementation of the above method steps will now be illustrated in mathematical detail. It should be understood that the above method steps are not the only way to implement asymmetry measurement and position measurement in accordance with an embodiment of the present invention. Moreover the mathematical detail below is not the only way to implement the above method steps in practice.

To facilitate the description and implementation of the asymmetry measurement technique, an alignment sensor model is defined and will be used throughout this document as an example. For convenience the same coordinate systems will be used as for the basic operation of the position measuring apparatus and associated asymmetry measuring arrangement 460. Firstly, the spatial coordinate system at target (substrate) level and the polarization coordinate system at pupil level are defined. Let x̂_(P), ŷ_(P) and ẑ_(P) denote the pupil spatial Cartesian coordinate system unit vectors. Let x̂_(T), ŷ_(T) and ẑ_(T) denote the target spatial Cartesian coordinate system unit vectors. Note that the intensity detectors 430A, 430B are located in planes conjugate to the target plane. Depending on the design of asymmetry measuring arrangement 460, intensity detectors there may be in a pupil plane or a target plane.

FIG. 12 illustrates some notation of coordinate system geometry. Both Cartesian and spherical coordinate systems may be useful at different points in the model and calculations, and suitable transformations will be defined for converting data between the coordinate systems. Notation (θ, φ) denotes a coordinate on a spherical coordinate system (i.e. a coordinate on the unit half sphere). Note that alternative notation (θ, ϕ) will also be used for reasons which will become clearer later. Notation (f, g) denotes a coordinate on a polar coordinate system (i.e. a coordinate on the unit disk). Note that the ẑ axis is either facing upwards or downwards as specified later in more detail.

FIG. 13 illustrates notation in a pupil coordinate system, when viewing in the direction of −1·ẑ_(P) (i.e. the negative spatial coordinate z unit vector). FIG. 14 illustrates notation in a target coordinate system, when viewing in the direction of ẑ_(T) (i.e. the positive spatial coordinate z unit vector). In addition to spatial coordinate systems, there are polarization coordinate systems, as the skilled person will understand. For simplicity throughout this description it is assumed that the target grating (alignment mark 202) is a one-dimensional periodic grating and is periodic with period (pitch) P in the x̂_(T) direction only (typical values of pitch for this application are 500 nm ≤ P ≤ 20000 nm). The geometry and calculations can be adapted by the skilled person for measuring the Y mark and XY marks 206, 210.

With reference to the notation introduced in FIGS. 12 to 18, the following terms are defined:

$\underset{\_}{\eta_{T}} = \begin{bmatrix}{- \frac{\lambda_{0}}{P}} \\0\end{bmatrix}$

-   denotes the grating vector. Note that the length of the grating vector η_(T) is defined in unit [sine angle] (which is a direct consequence of the grating equation). Note that, for the purposes of the present description, the grating vector always points in the direction of −1·x̂_(T).
-   (f′_(T), g′_(T)) denotes the incident ray (i.e. a plane wave) pupil coordinate in the target spatial Cartesian coordinate system (in unit [sine angle]).
-   (f″_(T), g″_(T))₀ and (f″_(T), g″_(T))⁻¹ denote respectively the reflected (zeroth order) and minus first order diffracted ray (i.e. a plane wave) pupil coordinate, in the target spatial Cartesian coordinate system (in unit [sine angle]).
-   (θ′_(T), φ′_(T)) denotes the position of the incident ray in the target spatial spherical coordinate system.
-   (θ″_(T), φ″_(T)) denotes the position of the reflected/diffracted ray in the target spatial spherical coordinate system.
-   ŝ_(T) and p̂_(T) denote the “senkrecht” (perpendicular) and parallel unit vectors of the target polarization coordinate system.

Similar notation with subscript P instead of T applies to the pupil coordinate system illustrated in FIG. 13. Note that the definition of the target polarization coordinate system as proposed above is discontinuous at the origin.

In a real implementation, allowances may need to be made for tilt relative to the apparatus coordinate system, and rotation about the Z axis. FIG. 15 illustrates notation for a tilt around the ŷ_(P) axis, while FIG. 16 illustrates notation for a tilt around the x̂_(P) axis. In FIG. 15, ρ_(ŷP) denotes the target tilt around the ŷ_(P) axis in unit [rad]. From the figure the following relation can be derived:

${{\cos \left( \phi_{P} \right)} \cdot \theta_{P}} = {{{\cos \left( {\pi - \varphi_{T}} \right)} \cdot \theta_{T}} + \rho_{{\underset{\_}{\hat{y}}}_{P}}}.$

In FIG. 16, ρ_(x̂P) denotes the target tilt around the x̂_(P) axis in unit [rad]. From the figure the following relation can be derived:

${{\sin \left( \phi_{P} \right)} \cdot \theta_{P}} = {{{\sin \left( {\pi - \varphi_{T}} \right)} \cdot \theta_{T}} - \rho_{{\underset{\_}{\hat{x}}}_{P}}}.$

FIG. 17 illustrates the influence of the target tilts on the polarization coordinate systems. From the figure the following relations can be derived:

$\left\{ {\begin{matrix}{f_{P} = {f_{T} + \rho_{{\hat{\underset{\_}{y}}}_{P}}}} \\{g_{P} = {g_{T} - \rho_{{\hat{\underset{\_}{x}}}_{P}}}}\end{matrix}.} \right.$

In FIG. 18, ρ_(ẑT) denotes the target rotation in unit [rad]. Note that alignment target rotations of ρ_(ẑT) = {−45°, 0°, 45°, 90°} are common. Note that the target polarization coordinate system, p̂_(T) and ŝ_(T), is invariant under rotation (around the ẑ_(T) axis).

Coordinate system transformations from the pupil spatial polar coordinate system to the target spatial spherical coordinate system and vice versa can be derived. Without going into the detailed derivation, the mapping SP2T: (θ_(P),φ_(P))→(θ_(T),φ_(T)) (Spatial Pupil To Target) can be shown to be

$\left( {\theta_{T},\varphi_{T}} \right) = {{{SP}\; 2\; {T\left( {\theta_{P},\phi_{P},\rho_{{\hat{\underset{\_}{x}}}_{P}},\rho_{{\hat{\underset{\_}{y}}}_{P}},\rho_{{\hat{\underset{\_}{z}}}_{T}}} \right)}} = \left\{ {\begin{matrix}{\varphi_{T} = \left\{ \begin{matrix}{\pi - {{atan}\; 2\left( {b,a} \right)} + \rho_{{\hat{\underset{\_}{z}}}_{T}}} & {{{if}\mspace{14mu} \theta_{T}} > 0} \\0 & {{{if}\mspace{14mu} \theta_{T}} = 0}\end{matrix} \right.} \\{\theta_{T} = \sqrt{a^{2} + b^{2}}} \\{a = {{{\cos \left( \phi_{P} \right)} \cdot \theta_{P}} - \rho_{{\hat{\underset{\_}{y}}}_{P}}}} \\{b = {{{\sin \left( \phi_{P} \right)} \cdot \theta_{P}} + \rho_{{\hat{\underset{\_}{x}}}_{P}}}}\end{matrix}.} \right.}$

The mapping ST2P: (θ_(T),φ_(T))→(θ_(P),φ_(P)) (Spatial Target To Pupil) can be shown to be

$\begin{matrix}{\left( {\theta_{P},\phi_{P}} \right) = {{ST}\; 2{P\left( {\theta_{T},\varphi_{T},\rho_{{\underset{\_}{\hat{x}}}_{P}},\rho_{{\underset{\_}{\hat{y}}}_{P}},\rho_{{\underset{\_}{\hat{z}}}_{T}}} \right)}}} \\{= \left\{ {\begin{matrix}{\phi_{P} = \left\{ \begin{matrix}{a\; \tan \; 2\left( {\beta,\alpha} \right)} & {{{if}\mspace{14mu} \theta_{P}} > 0} \\0 & {{{if}\mspace{14mu} \theta_{P}} = 0}\end{matrix} \right.} \\{\theta_{P} = \sqrt{\alpha^{2} + \beta^{2}}} \\{\alpha = {{\cos \; {\left( {\pi - \varphi_{T} + \rho_{{\underset{\_}{\hat{z}}}_{T}}} \right) \cdot \theta_{T}}} + \rho_{{\underset{\_}{\hat{y}}}_{P}}}} \\{\beta = {{{\sin \left( {\pi - \varphi_{T} + \rho_{{\underset{\_}{\hat{z}}}_{T}}} \right)} \cdot \theta_{T}} - \rho_{{\underset{\_}{\hat{x}}}_{P}}}}\end{matrix}.} \right.}\end{matrix}$

(Note that the subscript ‘P’ here indicates pupil plane, and is not to be confused with the variable P that is the period of the target grating.)
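The two spatial mappings can be transcribed directly into code, which also makes it easy to confirm that they are mutually inverse. The following Python sketch is illustrative only; the function and argument names are chosen for this example, and all angles and tilts are taken in radians.

```python
import math

def sp2t(theta_p, phi_p, rho_x, rho_y, rho_z):
    """Spatial Pupil To Target: (theta_P, phi_P) -> (theta_T, phi_T)."""
    a = math.cos(phi_p) * theta_p - rho_y
    b = math.sin(phi_p) * theta_p + rho_x
    theta_t = math.hypot(a, b)
    # The azimuth is undefined at theta_T = 0 and is set to zero there.
    phi_t = math.pi - math.atan2(b, a) + rho_z if theta_t > 0 else 0.0
    return theta_t, phi_t

def st2p(theta_t, phi_t, rho_x, rho_y, rho_z):
    """Spatial Target To Pupil: (theta_T, phi_T) -> (theta_P, phi_P)."""
    angle = math.pi - phi_t + rho_z
    alpha = math.cos(angle) * theta_t + rho_y
    beta = math.sin(angle) * theta_t - rho_x
    theta_p = math.hypot(alpha, beta)
    phi_p = math.atan2(beta, alpha) if theta_p > 0 else 0.0
    return theta_p, phi_p

# Round trip with small tilts and a 45 degree target rotation.
tilts = (0.01, -0.02, math.pi / 4)
theta_t, phi_t = sp2t(0.3, 0.7, *tilts)
print(st2p(theta_t, phi_t, *tilts))
```

The round trip recovers the pupil azimuth in the range (−π, π], which is the range returned by `math.atan2`.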

Further, coordinate system transformations from the incident ray in the target spherical coordinate system to the reflected/diffracted ray in the target spherical coordinate system can be derived. The unknown mapping SI2RD: (θ′_(T),φ′_(T),ν)→(θ″_(T),φ″_(T)) (Spatial Incident To Reflected/Diffracted) can be derived as the following:

$\begin{matrix}{\left( {\theta_{T}^{''},\varphi_{T}^{''}} \right) = {{SI}\; 2{{RD}\left( {\theta_{T}^{\prime},\varphi_{T}^{\prime},\lambda_{0},P,v} \right)}}} \\{= \left\{ {\begin{matrix}{\theta_{T}^{''} = \left\{ \begin{matrix}{a\; {\sin \left( \sqrt{f_{T}^{''\; 2} + g_{T}^{''\; 2}} \right)}} & {{{if}\mspace{14mu} \sqrt{f_{T}^{''\; 2} + g_{T}^{''\; 2}}}\; \leq 1} \\{a\; {\sin (1)}} & {otherwise}\end{matrix} \right.} \\{\varphi_{T}^{''} = {{a\; \tan \; 2\left( {g_{T}^{''},f_{T}^{''}} \right)} + \pi}} \\{f_{T}^{''} = {{{\cos \left( \varphi_{T}^{\prime} \right)} \cdot {\sin \left( \theta_{T}^{\prime} \right)}} - \frac{v \cdot \lambda_{0}}{P}}} \\{g_{T}^{''} = {{\sin \left( \varphi_{T}^{\prime} \right)} \cdot {\sin \left( \theta_{T}^{\prime} \right)}}}\end{matrix}.} \right.}\end{matrix}$

In this transformation: ν ∈ {−N, N} denotes the diffraction order (noting that ν=0 refers to the reflected order, and ν≠0 refers to the diffracted (higher) orders); P>0 again denotes the target grating pitch and λ₀>0 denotes the incident plane wave wavelength in vacuum (typical values for this application are 400 nm ≤ λ₀ ≤ 1100 nm).

Note that the radial coordinate √(f″_(T)² + g″_(T)²) is clipped to one.

Coordinate system transformations from the pupil polarization coordinate system to the target polarization coordinate system and vice versa can be derived. First, the counterclockwise rotation matrix is defined as:
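As a minimal sketch of the SI2RD mapping, the grating equation and the clipping of the radial coordinate can be written in Python as follows. The function name and the numerical values in the usage lines are chosen for illustration only; wavelength and pitch simply need to share the same length unit.

```python
import math

def si2rd(theta_i, phi_i, wavelength, pitch, order):
    """Spatial Incident To Reflected/Diffracted, angles in radians.

    Returns (theta'', phi'') of the reflected (order 0) or diffracted ray.
    """
    f = math.cos(phi_i) * math.sin(theta_i) - order * wavelength / pitch
    g = math.sin(phi_i) * math.sin(theta_i)
    # Clip the radial pupil coordinate to one (non-propagating orders).
    radius = min(math.hypot(f, g), 1.0)
    theta_rd = math.asin(radius)
    phi_rd = math.atan2(g, f) + math.pi
    return theta_rd, phi_rd

# Normal incidence, 500 nm wavelength, 1000 nm pitch, first order.
theta, phi = si2rd(0.0, 0.0, 500.0, 1000.0, 1)
print(math.degrees(theta))

# An order whose radial coordinate exceeds one is clipped to 90 degrees.
theta_clipped, _ = si2rd(0.0, 0.0, 1100.0, 500.0, 1)
print(math.degrees(theta_clipped))
```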

${\underset{\_}{\underset{\_}{\chi}}(\chi)} = {\begin{bmatrix}{\cos (\chi)} & {- {\sin (\chi)}} \\{\sin (\chi)} & {\cos (\chi)}\end{bmatrix}.}$

The unknown mapping PP2T: (p̂_(P),ŝ_(P))→(p̂_(T),ŝ_(T)) (Polarization Pupil To Target) can be derived as:

$\begin{matrix}{\left( {{\underset{\_}{\hat{p}}}_{T},{\underset{\_}{\hat{s}}}_{T}} \right) = {{PP}\; 2{T\left( {{\underset{\_}{\hat{p}}}_{P},{\underset{\_}{\hat{s}}}_{P}} \right)}}} \\{= \left\{ {\begin{matrix}{{\underset{\_}{\hat{p}}}_{T} = {{{\underset{\_}{\underset{\_}{\chi}}}_{P\; 2T}\left( {f_{T},g_{T},\rho_{{\underset{\_}{\hat{x}}}_{P}},\rho_{{\underset{\_}{\hat{y}}}_{P}}} \right)} \cdot {\underset{\_}{\hat{p}}}_{P}}} \\{{\underset{\_}{\hat{s}}}_{T} = {{{\underset{\_}{\underset{\_}{\chi}}}_{P\; 2T}\left( {f_{T},g_{T},\rho_{{\underset{\_}{\hat{x}}}_{P}},\rho_{{\underset{\_}{\hat{y}}}_{P}}} \right)} \cdot {\underset{\_}{\hat{s}}}_{P}}} \\{{{\underset{\_}{\underset{\_}{\chi}}}_{P\; 2T}\left( {f_{T},g_{T},\rho_{{\underset{\_}{\hat{x}}}_{P}},\rho_{{\underset{\_}{\hat{y}}}_{P}}} \right)} = \begin{bmatrix}{\cos \left( \chi_{P\; 2\; T} \right)} & {- {\sin \left( \chi_{P\; 2T} \right)}} \\{\sin \left( \chi_{P\; 2T} \right)} & {\cos \left( \chi_{P\; 2T} \right)}\end{bmatrix}} \\{\chi_{P\; 2T} = {{a\; \tan \; 2\left( {{- g_{T}},f_{T}} \right)} - {a\; \tan \; 2\left( {{{- g_{T}} - \rho_{{\underset{\_}{\hat{x}}}_{P}}},{f_{T} + \rho_{{\underset{\_}{\hat{y}}}_{P}}}} \right)}}}\end{matrix}.} \right.}\end{matrix}$

The mapping PT2P: (p̂_(T),ŝ_(T))→(p̂_(P),ŝ_(P)) (Polarization Target To Pupil) can be derived as:

$\begin{matrix}{\left( {{\underset{\_}{\hat{p}}}_{P},{\underset{\_}{\hat{s}}}_{P}} \right) = {{PT}\; 2{P\left( {{\underset{\_}{\hat{p}}}_{T},{\underset{\_}{\hat{s}}}_{T}} \right)}}} \\{= \left\{ {\begin{matrix}{{\underset{\_}{\hat{p}}}_{P} = {{{\underset{\_}{\underset{\_}{\chi}}}_{T\; 2P}\left( {f_{T},g_{T},\rho_{{\underset{\_}{\hat{x}}}_{P}},\rho_{{\underset{\_}{\hat{y}}}_{P}}} \right)} \cdot {\underset{\_}{\hat{p}}}_{T}}} \\{{\underset{\_}{\hat{s}}}_{P} = {{{\underset{\_}{\underset{\_}{\chi}}}_{T\; 2P}\left( {f_{T},g_{T},\rho_{{\underset{\_}{\hat{x}}}_{P}},\rho_{{\underset{\_}{\hat{y}}}_{P}}} \right)} \cdot {\underset{\_}{\hat{s}}}_{T}}} \\{{{\underset{\_}{\underset{\_}{\chi}}}_{T\; 2P}\left( {f_{T},g_{T},\rho_{{\underset{\_}{\hat{x}}}_{P}},\rho_{{\underset{\_}{\hat{y}}}_{P}}} \right)} = \left( {{\underset{\_}{\underset{\_}{\chi}}}_{P\; 2T}\left( {f_{T},g_{T},\rho_{{\underset{\_}{\hat{x}}}_{P}},\rho_{{\underset{\_}{\hat{y}}}_{P}}} \right)} \right)^{T}} \\{{{\underset{\_}{\underset{\_}{\chi}}}_{P\; 2T}\left( {f_{T},g_{T},\rho_{{\underset{\_}{\hat{x}}}_{P}},\rho_{{\underset{\_}{\hat{y}}}_{P}}} \right)} = \begin{bmatrix}{\cos \left( \chi_{P\; 2T} \right)} & {- {\sin \left( \chi_{P\; 2T} \right)}} \\{\sin \left( \chi_{P\; 2T} \right)} & {\cos \left( \chi_{P\; 2T} \right)}\end{bmatrix}} \\{\chi_{P\; 2T} = {{a\; \tan \; 2\left( {{- g_{T}},f_{T}} \right)} - {a\; \tan \; 2\left( {{{- g_{T}} - \rho_{{\underset{\_}{\hat{x}}}_{P}}},{f_{T} + \rho_{{\underset{\_}{\hat{y}}}_{P}}}} \right)}}}\end{matrix}.} \right.}\end{matrix}$
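The polarization rotation between pupil and target can be sketched numerically as below. This Python fragment is illustrative only (the function names are not part of the described apparatus); it shows that the rotation reduces to the identity when both tilts are zero, and that the target-to-pupil matrix, being the transpose, undoes the pupil-to-target rotation.

```python
import math
import numpy as np

def chi_p2t(f_t, g_t, rho_x, rho_y):
    """Rotation angle between pupil and target polarization frames [rad]."""
    return (math.atan2(-g_t, f_t)
            - math.atan2(-g_t - rho_x, f_t + rho_y))

def rot_p2t(f_t, g_t, rho_x, rho_y):
    """Counterclockwise rotation matrix for the pupil-to-target mapping."""
    chi = chi_p2t(f_t, g_t, rho_x, rho_y)
    c, s = math.cos(chi), math.sin(chi)
    return np.array([[c, -s], [s, c]])

def rot_t2p(f_t, g_t, rho_x, rho_y):
    """Target-to-pupil rotation: the transpose of the pupil-to-target one."""
    return rot_p2t(f_t, g_t, rho_x, rho_y).T

args = (0.2, -0.1, 0.01, 0.02)   # (f_T, g_T, tilt about x, tilt about y)
print(rot_p2t(0.2, -0.1, 0.0, 0.0))          # identity for zero tilts
print(rot_t2p(*args) @ rot_p2t(*args))       # identity: mappings are inverse
```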

The mapping PPPS2XY: (p̂_(P),ŝ_(P))→(x̂_(P),ŷ_(P)) (Polarization Pupil Parallel Senkrecht To X Y) can be derived as:

$\begin{matrix}{\left( {{\underset{\_}{\hat{x}}}_{P},{\underset{\_}{\hat{y}}}_{P}} \right) = {{PPPS}\; 2{{XY}\left( {{\underset{\_}{\hat{p}}}_{P},{\underset{\_}{\hat{s}}}_{P}} \right)}}} \\{= \left\{ {\begin{matrix}{{\underset{\_}{\hat{x}}}_{P} = {{{\underset{\_}{\underset{\_}{\Gamma}}}_{{ps}\; 2{xy}}\left( {g_{P},f_{P}} \right)} \cdot {\underset{\_}{\hat{p}}}_{P}}} \\{{\underset{\_}{\hat{y}}}_{P} = {{{\underset{\_}{\underset{\_}{\Gamma}}}_{{ps}\; 2{xy}}\left( {g_{P},f_{P}} \right)} \cdot {\underset{\_}{\hat{s}}}_{P}}} \\{{{\underset{\_}{\underset{\_}{\Gamma}}}_{{ps}\; 2{xy}}\left( {g_{P},f_{P}} \right)} = {\begin{bmatrix}{- 1} & 0 \\0 & 1\end{bmatrix} \cdot {{\underset{\_}{\underset{\_}{\chi}}}_{{ps}\; 2{xy}}\left( {g_{P},f_{P}} \right)}}} \\{{{\underset{\_}{\underset{\_}{\chi}}}_{{ps}\; 2{xy}}\left( {g_{P},f_{P}} \right)} = \begin{bmatrix}{\cos \left( \chi_{{ps}\; 2{xy}} \right)} & {- {\sin \left( \chi_{{ps}\; 2{xy}} \right)}} \\{\sin \; \left( \chi_{{ps}\; 2{xy}} \right)} & {\cos \left( \chi_{{ps}\; 2{xy}} \right)}\end{bmatrix}} \\{\chi_{{ps}\; 2{xy}} = {{{- 1} \cdot a}\; \tan \; 2\left( {g_{P},f_{P}} \right)}}\end{matrix}.} \right.}\end{matrix}$

The mapping PPXY2PS: (x̂_(P),ŷ_(P))→(p̂_(P),ŝ_(P)) (Polarization Pupil X Y To Parallel Senkrecht) can be derived as:

$\begin{matrix}{\left( {{\underset{\_}{\hat{p}}}_{P},{\underset{\_}{\hat{s}}}_{P}} \right) = {{PPXY}\; 2{{PS}\left( {{\underset{\_}{\hat{x}}}_{P},{\underset{\_}{\hat{y}}}_{P}} \right)}}} \\{= \left\{ {\begin{matrix}{{\underset{\_}{\hat{p}}}_{P} = {{{\underset{\_}{\underset{\_}{\Gamma}}}_{{xy}\; 2{ps}}\left( {g_{P},f_{P}} \right)} \cdot {\underset{\_}{\hat{x}}}_{P}}} \\{{\underset{\_}{\hat{s}}}_{P} = {{{\underset{\_}{\underset{\_}{\Gamma}}}_{{xy}\; 2{ps}}\left( {g_{P},f_{P}} \right)} \cdot {\underset{\_}{\hat{y}}}_{P}}} \\{{{\underset{\_}{\underset{\_}{\Gamma}}}_{{xy}\; 2{ps}}\left( {g_{P},f_{P}} \right)} = \left( {{\underset{\_}{\underset{\_}{\Gamma}}}_{{ps}\; 2{xy}}\left( {g_{P},f_{P}} \right)} \right)^{T}} \\{{{\underset{\_}{\underset{\_}{\Gamma}}}_{{ps}2{xy}}\left( {g_{P},f_{P}} \right)} = {\begin{bmatrix}{- 1} & 0 \\0 & 1\end{bmatrix} \cdot {{\underset{\_}{\underset{\_}{\chi}}}_{{ps}\; 2{xy}}\left( {g_{P},f_{P}} \right)}}} \\{{{\underset{\_}{\underset{\_}{\chi}}}_{{ps}\; 2{xy}}\left( {g_{P},f_{P}} \right)} = \begin{bmatrix}{\cos \left( \chi_{{ps}\; 2{xy}} \right)} & {- {\sin \left( \chi_{{ps}\; 2{xy}} \right)}} \\{\sin \; \left( \chi_{{ps}\; 2{xy}} \right)} & {\cos \left( \chi_{{ps}\; 2{xy}} \right)}\end{bmatrix}} \\{\chi_{{ps}\; 2{xy}} = {{{- 1} \cdot a}\; \tan \; 2\left( {g_{P},f_{P}} \right)}}\end{matrix}.} \right.}\end{matrix}$

Having defined coordinate systems and transformations, it is assumed that the (complex) pupil plane electric field amplitudes are known/given. These (complex) pupil plane electric field amplitudes can be computed using any suitable approach. In the present implementation, a Jones calculus model is used to compute these fields given the illumination field (in terms of wavelength, angle and polarization) and given coefficients of reflection and diffraction of the target structure (alignment mark, substrate and overlying stack). These coefficients can be computed by solving Maxwell's equations for a model of the target and surrounding materials. The equations can be solved for example by the well-known technique of RCWA (rigorous coupled-wave analysis). This (complex) pupil plane electric field amplitude, in the x̂_(P) and ŷ_(P) polarization coordinate system, can be denoted by the following equation

${\underset{\_}{E}}_{P,v} = {\begin{bmatrix}E_{P,v,{\underset{\_}{\hat{x}}}_{P}} \\E_{P,v,{\underset{\_}{\hat{y}}}_{P}}\end{bmatrix}.}$

Again, ν ∈ {−N, N} denotes the diffraction order, ν=0 refers to the reflected order, and ν≠0 refers to the diffracted orders. How to calculate the intensity as seen by the detectors 430A, 430B in the alignment sensor will now be discussed. As mentioned already, the case of off-axis illumination will be considered. The case of on-axis illumination can be derived as a special case. Note that in the off-axis illuminated alignment sensor, both illumination rays (spots in the pupil plane) are mutually coherent and in phase. Hence the (complex) electric field amplitudes, of positive and negative diffraction orders, as summed below, may originate from a different incident plane wave. Note that it will be assumed here that the target tilts are zero.

The (complex) pupil plane electric field amplitude, as a function of the (stage) scan position, equals

${{\underset{\_}{E}}_{P,v}\left( x_{stage} \right)} = {{\exp \left( \frac{{- } \cdot 2 \cdot \pi \cdot v \cdot x_{stage}}{P} \right)} \cdot {{{\underset{\_}{E}}_{P,v}\left( {x_{stage} = 0} \right)}.}}$

where x_(stage) denotes the scan x-position of, for example, thesubstrate table WT. Note that it is assumed here that the scanningmovement is pointing in the direction of (i.e. parallel to) the{circumflex over (x)} _(P) direction. Again, P>0 denotes the targetgrating pitch.

Note that the phase term exp

$\left( \frac{{- } \cdot 2 \cdot \pi \cdot v \cdot x_{stage}}{P} \right)$

as introduced above, can be derived from a Fourier optics treatment(i.e. a Fourier series expansion) of the alignment target. Thecounterclockwise rotation matrix is defined again to equal

${\underset{\_}{\underset{\_}{\chi}}(\chi)} = {\begin{bmatrix}{\cos (\chi)} & {- {\sin (\chi)}} \\{\sin (\chi)} & {\cos (\chi)}\end{bmatrix}.}$

The (complex) electric field amplitudes, after passing, in order, the half-wave plate 510, the self-referencing interferometer 428 and the phase compensator 512, are the following:

$\left\{ \begin{matrix} \underline{E}_{U,\nu}(x_{stage}) = \underline{\underline{\chi}}(90^{\circ}) \cdot \begin{bmatrix}1 & 0 \\ 0 & 0\end{bmatrix} \cdot \underline{\underline{\chi}}(22.5^{\circ}) \cdot \begin{bmatrix}1 & 0 \\ 0 & -1\end{bmatrix} \cdot \underline{\underline{\chi}}(-22.5^{\circ}) \cdot \underline{E}_{P,\nu}(x_{stage}) + \underline{\underline{\chi}}(-90^{\circ}) \cdot \begin{bmatrix}0 & 0 \\ 0 & 1\end{bmatrix} \cdot \underline{\underline{\chi}}(22.5^{\circ}) \cdot \begin{bmatrix}1 & 0 \\ 0 & -1\end{bmatrix} \cdot \underline{\underline{\chi}}(-22.5^{\circ}) \cdot \underline{E}_{P,-\nu}(x_{stage}) \\ \underline{E}_{L,\nu}(x_{stage}) = \underline{\underline{\chi}}(-90^{\circ}) \cdot \begin{bmatrix}0 & 0 \\ 0 & 1\end{bmatrix} \cdot \underline{\underline{\chi}}(22.5^{\circ}) \cdot \begin{bmatrix}1 & 0 \\ 0 & -1\end{bmatrix} \cdot \underline{\underline{\chi}}(-22.5^{\circ}) \cdot \underline{E}_{P,\nu}(x_{stage}) + \underline{\underline{\chi}}(90^{\circ}) \cdot \begin{bmatrix}1 & 0 \\ 0 & 0\end{bmatrix} \cdot \underline{\underline{\chi}}(22.5^{\circ}) \cdot \begin{bmatrix}1 & 0 \\ 0 & -1\end{bmatrix} \cdot \underline{\underline{\chi}}(-22.5^{\circ}) \cdot \underline{E}_{P,-\nu}(x_{stage}) \end{matrix} \right.$

$\left\{ \begin{matrix} \underline{E}_{U,\nu}(x_{stage}) = \frac{1}{\sqrt{2}} \cdot \begin{bmatrix}0 & 0 \\ 1 & 1\end{bmatrix} \cdot \underline{E}_{P,\nu}(x_{stage}) + \frac{1}{\sqrt{2}} \cdot \begin{bmatrix}1 & -1 \\ 0 & 0\end{bmatrix} \cdot \underline{E}_{P,-\nu}(x_{stage}) \\ \underline{E}_{L,\nu}(x_{stage}) = \frac{1}{\sqrt{2}} \cdot \begin{bmatrix}1 & -1 \\ 0 & 0\end{bmatrix} \cdot \underline{E}_{P,\nu}(x_{stage}) + \frac{1}{\sqrt{2}} \cdot \begin{bmatrix}0 & 0 \\ 1 & 1\end{bmatrix} \cdot \underline{E}_{P,-\nu}(x_{stage}) \end{matrix} \right. .$

Note that the indices ν and −ν are no longer applicable after passing the self-referencing interferometer, and hence have been replaced by U and L. Referring again to FIGS. 5 to 8, the U and L components may be regarded as the +1/−1 and −1/+1 combined components, for the case where ν=1, and similarly for all ν∈{−N,N}.

Note that it is assumed above that linear x- and y-polarized illumination radiation is being supplied to the target. The rotation of 45° is then effected by half-wave plate 510 at the input side of interferometer 428 (both shown in FIG. 9). If in another example linear 22.5° polarized illumination radiation is used, then the orientation of the half-wave plate is altered to ensure that the radiation that enters the self-referencing interferometer will be linearly polarized at ±45°. (This condition is desirable with the particular interferometer used in these examples, so that both internal “channels” in the self-referencing interferometer are excited evenly.)

Generalizing the above expressions to allow for an arbitrary polarization rotation to be applied by half-wave plate 510 results in:

$\left\{ \begin{matrix} \underline{E}_{U,\nu}(x_{stage}) = \underline{\underline{\chi}}(90^{\circ}) \cdot \begin{bmatrix}1 & 0 \\ 0 & 0\end{bmatrix} \cdot \underline{\underline{\chi}}(\gamma) \cdot \begin{bmatrix}1 & 0 \\ 0 & -1\end{bmatrix} \cdot \underline{\underline{\chi}}(-\gamma) \cdot \underline{E}_{P,\nu}(x_{stage}) + \underline{\underline{\chi}}(-90^{\circ}) \cdot \begin{bmatrix}0 & 0 \\ 0 & 1\end{bmatrix} \cdot \underline{\underline{\chi}}(\gamma) \cdot \begin{bmatrix}1 & 0 \\ 0 & -1\end{bmatrix} \cdot \underline{\underline{\chi}}(-\gamma) \cdot \underline{E}_{P,-\nu}(x_{stage}) \\ \underline{E}_{L,\nu}(x_{stage}) = \underline{\underline{\chi}}(-90^{\circ}) \cdot \begin{bmatrix}0 & 0 \\ 0 & 1\end{bmatrix} \cdot \underline{\underline{\chi}}(\gamma) \cdot \begin{bmatrix}1 & 0 \\ 0 & -1\end{bmatrix} \cdot \underline{\underline{\chi}}(-\gamma) \cdot \underline{E}_{P,\nu}(x_{stage}) + \underline{\underline{\chi}}(90^{\circ}) \cdot \begin{bmatrix}1 & 0 \\ 0 & 0\end{bmatrix} \cdot \underline{\underline{\chi}}(\gamma) \cdot \begin{bmatrix}1 & 0 \\ 0 & -1\end{bmatrix} \cdot \underline{\underline{\chi}}(-\gamma) \cdot \underline{E}_{P,-\nu}(x_{stage}) \end{matrix} \right.$

where γ denotes the (counter-clockwise) rotation of the half-wave plate 510 (fast axis) located before the self-referencing interferometer. Introducing shorthand matrices

$\underline{\underline{\Pi}}_{A,\gamma} = \underline{\underline{\chi}}(90^{\circ}) \cdot \begin{bmatrix}1 & 0 \\ 0 & 0\end{bmatrix} \cdot \underline{\underline{\chi}}(\gamma) \cdot \begin{bmatrix}1 & 0 \\ 0 & -1\end{bmatrix} \cdot \underline{\underline{\chi}}(-\gamma) \quad \text{and} \quad \underline{\underline{\Pi}}_{B,\gamma} = \underline{\underline{\chi}}(-90^{\circ}) \cdot \begin{bmatrix}0 & 0 \\ 0 & 1\end{bmatrix} \cdot \underline{\underline{\chi}}(\gamma) \cdot \begin{bmatrix}1 & 0 \\ 0 & -1\end{bmatrix} \cdot \underline{\underline{\chi}}(-\gamma)$

the expressions for the detector-level amplitudes can be rewritten as:

$\left\{ \begin{matrix} \underline{E}_{U,\nu}(x_{stage}) = \underline{\underline{\Pi}}_{A,\gamma} \cdot \underline{E}_{P,\nu}(x_{stage}) + \underline{\underline{\Pi}}_{B,\gamma} \cdot \underline{E}_{P,-\nu}(x_{stage}) \\ \underline{E}_{L,\nu}(x_{stage}) = \underline{\underline{\Pi}}_{B,\gamma} \cdot \underline{E}_{P,\nu}(x_{stage}) + \underline{\underline{\Pi}}_{A,\gamma} \cdot \underline{E}_{P,-\nu}(x_{stage}) \end{matrix} \right. .$
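The matrix products above can be checked numerically. The following is a minimal sketch, assuming plain 2×2 Jones matrices; the helper names (`rot`, `half_wave_plate`, `pi_a`, `pi_b`) are illustrative and not part of the described apparatus. Π_B is built with χ(−90°), as in the expanded expressions for the fields after the interferometer.

```python
import numpy as np

def rot(angle_deg):
    """Rotation matrix chi(angle) for a counter-clockwise angle in degrees."""
    a = np.radians(angle_deg)
    return np.array([[np.cos(a), -np.sin(a)],
                     [np.sin(a),  np.cos(a)]])

def half_wave_plate(gamma_deg):
    """Half-wave plate with fast axis at gamma: chi(g) . diag(1, -1) . chi(-g)."""
    return rot(gamma_deg) @ np.diag([1.0, -1.0]) @ rot(-gamma_deg)

SELECT_X = np.array([[1.0, 0.0], [0.0, 0.0]])  # keeps the first field component
SELECT_Y = np.array([[0.0, 0.0], [0.0, 1.0]])  # keeps the second field component

def pi_a(gamma_deg):
    """Pi_A = chi(90) . [[1, 0], [0, 0]] . HWP(gamma)."""
    return rot(90.0) @ SELECT_X @ half_wave_plate(gamma_deg)

def pi_b(gamma_deg):
    """Pi_B = chi(-90) . [[0, 0], [0, 1]] . HWP(gamma)."""
    return rot(-90.0) @ SELECT_Y @ half_wave_plate(gamma_deg)

# Special case gamma = 22.5 degrees, scaled by sqrt(2) for readability.
scaled_a = np.sqrt(2.0) * pi_a(22.5)
scaled_b = np.sqrt(2.0) * pi_b(22.5)
print(np.round(scaled_a, 6))
print(np.round(scaled_b, 6))
```

For γ = 22.5° the two scaled matrices should agree with the simplified matrices of the special case given earlier, which is a quick way to catch a sign slip in any of the rotation angles.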

Further, by applying the above-defined expression for the complex pupil plane electric field amplitude as a function of the scan position, and by introducing shorthand notation for electric fields E_(A,ν)=Π_(A,γ)·E_(P,ν)(x_(stage)=0) and E_(B,ν)=Π_(B,γ)·E_(P,ν)(x_(stage)=0), the following expressions can be derived:

$\left\{ \begin{matrix} \underline{E}_{U,\nu}(x_{stage}) = \exp\left(\frac{-i \cdot 2 \cdot \pi \cdot \nu \cdot x_{stage}}{P}\right) \cdot \underline{E}_{A,\nu} + \exp\left(\frac{i \cdot 2 \cdot \pi \cdot \nu \cdot x_{stage}}{P}\right) \cdot \underline{E}_{B,-\nu} \\ \underline{E}_{L,\nu}(x_{stage}) = \exp\left(\frac{-i \cdot 2 \cdot \pi \cdot \nu \cdot x_{stage}}{P}\right) \cdot \underline{E}_{B,\nu} + \exp\left(\frac{i \cdot 2 \cdot \pi \cdot \nu \cdot x_{stage}}{P}\right) \cdot \underline{E}_{A,-\nu} \end{matrix} \right.$

Recall from FIG. 9 that, for each color, sum and difference signals are separately carried by the branches A and B. Which branch carries which signal is set for each color by the input polarization. The sum and difference (complex) electric field amplitudes at detector 430A/430B can be computed by propagating the electric fields E_(U,ν)(x_(stage)) and E_(L,ν)(x_(stage)) (after the self-referencing interferometer) in the following order through the half-wave plate 514, the polarizing beam splitter 516 and the collector lens assemblies 484A/484B onto the fiber entrance. The result is the following:

$\left\{ \begin{matrix} \underline{E}_{D,sum,U,\nu}(x_{stage}) = \begin{bmatrix}1 & 0 \\ 0 & 0\end{bmatrix} \cdot \underline{\underline{\chi}}(22.5^{\circ}) \cdot \begin{bmatrix}1 & 0 \\ 0 & -1\end{bmatrix} \cdot \underline{\underline{\chi}}(-22.5^{\circ}) \cdot \underline{E}_{U,\nu}(x_{stage}) \\ \underline{E}_{D,sum,L,\nu}(x_{stage}) = \begin{bmatrix}1 & 0 \\ 0 & 0\end{bmatrix} \cdot \underline{\underline{\chi}}(22.5^{\circ}) \cdot \begin{bmatrix}1 & 0 \\ 0 & -1\end{bmatrix} \cdot \underline{\underline{\chi}}(-22.5^{\circ}) \cdot \underline{E}_{L,\nu}(x_{stage}) \\ \underline{E}_{D,diff,U,\nu}(x_{stage}) = \begin{bmatrix}0 & 0 \\ 0 & 1\end{bmatrix} \cdot \underline{\underline{\chi}}(22.5^{\circ}) \cdot \begin{bmatrix}1 & 0 \\ 0 & -1\end{bmatrix} \cdot \underline{\underline{\chi}}(-22.5^{\circ}) \cdot \underline{E}_{U,\nu}(x_{stage}) \\ \underline{E}_{D,diff,L,\nu}(x_{stage}) = \begin{bmatrix}0 & 0 \\ 0 & 1\end{bmatrix} \cdot \underline{\underline{\chi}}(22.5^{\circ}) \cdot \begin{bmatrix}1 & 0 \\ 0 & -1\end{bmatrix} \cdot \underline{\underline{\chi}}(-22.5^{\circ}) \cdot \underline{E}_{L,\nu}(x_{stage}) \end{matrix} \right.$

$\left\{ \begin{matrix} \underline{E}_{D,sum,U,\nu}(x_{stage}) = \frac{1}{\sqrt{2}} \cdot \begin{bmatrix}1 & 1 \\ 0 & 0\end{bmatrix} \cdot \underline{E}_{U,\nu}(x_{stage}) \\ \underline{E}_{D,sum,L,\nu}(x_{stage}) = \frac{1}{\sqrt{2}} \cdot \begin{bmatrix}1 & 1 \\ 0 & 0\end{bmatrix} \cdot \underline{E}_{L,\nu}(x_{stage}) \\ \underline{E}_{D,diff,U,\nu}(x_{stage}) = \frac{1}{\sqrt{2}} \cdot \begin{bmatrix}0 & 0 \\ 1 & -1\end{bmatrix} \cdot \underline{E}_{U,\nu}(x_{stage}) \\ \underline{E}_{D,diff,L,\nu}(x_{stage}) = \frac{1}{\sqrt{2}} \cdot \begin{bmatrix}0 & 0 \\ 1 & -1\end{bmatrix} \cdot \underline{E}_{L,\nu}(x_{stage}) \end{matrix} \right. .$

The sum and difference (alignment) detector intensities can now be computed by summing the contributions from the different diffraction orders as shown in the following equations:

$\left\{ \begin{matrix} I_{D,sum}(x_{stage}) = \sum_{\substack{\nu = -N \\ \nu \neq 0}}^{N} \langle \underline{E}_{D,sum,\nu}(x_{stage}), \underline{E}_{D,sum,\nu}(x_{stage}) \rangle \\ I_{D,diff}(x_{stage}) = \sum_{\substack{\nu = -N \\ \nu \neq 0}}^{N} \langle \underline{E}_{D,diff,\nu}(x_{stage}), \underline{E}_{D,diff,\nu}(x_{stage}) \rangle \end{matrix} \right.$

$\left\{ \begin{matrix} I_{D,sum}(x_{stage}) = \sum_{\substack{\nu = -N \\ \nu \neq 0}}^{N} \langle \underline{E}_{D,sum,U,\nu}(x_{stage}), \underline{E}_{D,sum,U,\nu}(x_{stage}) \rangle + \sum_{\substack{\nu = -N \\ \nu \neq 0}}^{N} \langle \underline{E}_{D,sum,L,\nu}(x_{stage}), \underline{E}_{D,sum,L,\nu}(x_{stage}) \rangle \\ I_{D,diff}(x_{stage}) = \sum_{\substack{\nu = -N \\ \nu \neq 0}}^{N} \langle \underline{E}_{D,diff,U,\nu}(x_{stage}), \underline{E}_{D,diff,U,\nu}(x_{stage}) \rangle + \sum_{\substack{\nu = -N \\ \nu \neq 0}}^{N} \langle \underline{E}_{D,diff,L,\nu}(x_{stage}), \underline{E}_{D,diff,L,\nu}(x_{stage}) \rangle \end{matrix} \right.$

$\left\{ \begin{matrix} I_{D,sum}(x_{stage}) = \sum_{\substack{\nu = -N \\ \nu \neq 0}}^{N} \left( \underline{E}_{D,sum,U,\nu}^{H}(x_{stage}) \cdot \underline{E}_{D,sum,U,\nu}(x_{stage}) + \underline{E}_{D,sum,L,\nu}^{H}(x_{stage}) \cdot \underline{E}_{D,sum,L,\nu}(x_{stage}) \right) \\ I_{D,diff}(x_{stage}) = \sum_{\substack{\nu = -N \\ \nu \neq 0}}^{N} \left( \underline{E}_{D,diff,U,\nu}^{H}(x_{stage}) \cdot \underline{E}_{D,diff,U,\nu}(x_{stage}) + \underline{E}_{D,diff,L,\nu}^{H}(x_{stage}) \cdot \underline{E}_{D,diff,L,\nu}(x_{stage}) \right) \end{matrix} \right.$

which can be expanded to

${I_{D,{sum}}\left( x_{stage} \right)} = {\sum\limits_{{v = {- N}}{v \neq 0}}^{N}\begin{pmatrix}{\begin{matrix}{\begin{pmatrix}{{{\exp \left( \frac{ \cdot 2 \cdot \pi \cdot v \cdot x_{stage}}{P} \right)} \cdot {\underset{\_}{E}}_{A,v}^{H}} +} \\{{\exp \left( \frac{ \cdot 2 \cdot \pi \cdot v \cdot x_{stage}}{P} \right)} \cdot {\underset{\_}{E}}_{B,{- v}}^{H}}\end{pmatrix}{\frac{1}{2} \cdot \begin{bmatrix}1 & 1 \\1 & 1\end{bmatrix} \cdot}} \\\begin{pmatrix}{{{\exp \left( \frac{ \cdot 2 \cdot \pi \cdot v \cdot x_{stage}}{P} \right)} \cdot {\underset{\_}{E}}_{A,v}^{H}} +} \\{{\exp \left( \frac{ \cdot 2 \cdot \pi \cdot v \cdot x_{stage}}{P} \right)} \cdot {\underset{\_}{E}}_{B,{- v}}^{H}}\end{pmatrix}\end{matrix} +} \\\begin{matrix}{\begin{pmatrix}{{{\exp \left( \frac{ \cdot 2 \cdot \pi \cdot v \cdot x_{stage}}{P} \right)} \cdot {\underset{\_}{E}}_{B,v}^{H}} +} \\{{\exp \left( \frac{{- } \cdot 2 \cdot \pi \cdot v \cdot x_{stage}}{P} \right)} \cdot {\underset{\_}{E}}_{A,{- v}}^{H}}\end{pmatrix} \cdot \frac{1}{2} \cdot \begin{bmatrix}1 & 1 \\1 & 1\end{bmatrix} \cdot} \\\begin{pmatrix}{{{\exp \left( \frac{{- } \cdot 2 \cdot \pi \cdot v \cdot x_{stage}}{P} \right)} \cdot {\underset{\_}{E}}_{B,v}} +} \\{{\exp \left( \frac{ \cdot 2 \cdot \pi \cdot v \cdot x_{stage}}{P} \right)} \cdot {\underset{\_}{E}}_{A,{- v}}}\end{pmatrix}\end{matrix}\end{pmatrix}}$${I_{D,{diff}}\left( x_{stage} \right)} = {\sum\limits_{{v = {- N}}{v \neq 0}}^{N}\begin{pmatrix}{\begin{matrix}{\begin{pmatrix}{{{\exp \left( \frac{ \cdot 2 \cdot \pi \cdot v \cdot x_{stage}}{P} \right)} \cdot {\underset{\_}{E}}_{A,v}^{H}} +} \\{{\exp \left( \frac{{- } \cdot 2 \cdot \pi \cdot v \cdot x_{stage}}{P} \right)} \cdot {\underset{\_}{E}}_{B,{- v}}^{H}}\end{pmatrix} \cdot \frac{1}{2} \cdot \begin{bmatrix}1 & {- 1} \\{- 1} & 1\end{bmatrix} \cdot} \\\begin{pmatrix}{{{\exp \left( \frac{{- } \cdot 2 \cdot \pi \cdot v \cdot x_{stage}}{P} \right)} \cdot {\underset{\_}{E}}_{A,v}} +} \\{{\exp 
\left( \frac{ \cdot 2 \cdot \pi \cdot v \cdot x_{stage}}{P} \right)} \cdot {\underset{\_}{E}}_{B,{- v}}^{H}}\end{pmatrix}\end{matrix} +} \\\begin{matrix}{\begin{pmatrix}{{{\exp \left( \frac{ \cdot 2 \cdot \pi \cdot v \cdot x_{stage}}{P} \right)} \cdot {\underset{\_}{E}}_{B,v}^{H}} +} \\{{\exp \left( \frac{{- } \cdot 2 \cdot \pi \cdot v \cdot x_{stage}}{P} \right)} \cdot {\underset{\_}{E}}_{B,v}^{H}}\end{pmatrix} \cdot \frac{1}{2} \cdot \begin{bmatrix}1 & {- 1} \\{- 1} & 1\end{bmatrix} \cdot} \\\begin{pmatrix}{{{\exp \left( \frac{{- } \cdot 2 \cdot \pi \cdot v \cdot x_{stage}}{P} \right)} \cdot {\underset{\_}{E}}_{B,v}} +} \\{{\exp \left( \frac{ \cdot 2 \cdot \pi \cdot v \cdot x_{stage}}{P} \right)} \cdot {\underset{\_}{E}}_{A,{- v}}}\end{pmatrix}\end{matrix}\end{pmatrix}}$

Whichever form of expression is used, it will be seen that each of these intensity values, corresponding to the position-varying waveform recorded as the spot 202 scans a target, is the summation of N different orders, corresponding to the diffraction orders ν. Within each order ν, there are two constant terms, representing a DC component, and a periodic term with spatial frequency 4πν/P. Comparing the sum and difference signals, it can be seen that they are identical except that their periodic components are in antiphase. Note that it is assumed that the zeroth diffraction order (i.e. the reflected order with ν=0) is blocked somewhere along the path from objective 424 to the interferometer, as already described above. It is further assumed that the detector surface is large relative to the pitch of the target grating. This means that the electric field amplitudes at the detector surface, resulting from pairs of two plane waves incident on the detector surface, are all orthogonal on the interval defined by the detector surface area. Hence these electric field amplitudes at the detector surfaces, due to the different order pairs, may be summed incoherently, as has been done.
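The antiphase relationship between the sum and difference waveforms can be illustrated numerically. The sketch below evaluates the expanded intensity expressions for given shorthand fields E_(A,ν) and E_(B,ν); the function name, the dictionary layout and the example amplitudes are assumptions made for this illustration, not values from the described apparatus.

```python
import numpy as np

K_SUM = 0.5 * np.array([[1.0, 1.0], [1.0, 1.0]])     # weighting for the sum branch
K_DIFF = 0.5 * np.array([[1.0, -1.0], [-1.0, 1.0]])  # weighting for the difference branch

def detector_intensities(x_stage, pitch, e_a, e_b):
    """Sum and difference intensities along a 1-D scan.

    e_a, e_b: dicts mapping each non-zero order v (both signs present)
    to a complex 2-vector E_A,v or E_B,v.
    """
    x_stage = np.asarray(x_stage, dtype=float)
    i_sum = np.zeros_like(x_stage)
    i_diff = np.zeros_like(x_stage)
    for v in e_a:
        phase = np.exp(-2j * np.pi * v * x_stage / pitch)[:, None]
        e_u = phase * e_a[v] + phase.conj() * e_b[-v]  # field E_U,v(x)
        e_l = phase * e_b[v] + phase.conj() * e_a[-v]  # field E_L,v(x)
        for field in (e_u, e_l):
            # Orders are summed incoherently, as in the text.
            i_sum += np.einsum("ni,ij,nj->n", field.conj(), K_SUM, field).real
            i_diff += np.einsum("ni,ij,nj->n", field.conj(), K_DIFF, field).real
    return i_sum, i_diff

# Hypothetical symmetric target with only the first orders present.
amp_a = np.array([0.0, 1.0]) / np.sqrt(2.0)
amp_b = np.array([1.0, 0.0]) / np.sqrt(2.0)
e_a = {+1: amp_a, -1: amp_a}
e_b = {+1: amp_b, -1: amp_b}
pitch = 1.0
x = np.linspace(-0.25, 0.25, 201)
i_sum, i_diff = detector_intensities(x, pitch, e_a, e_b)
```

For this symmetric example the sum waveform peaks at x_(stage)=0 while the difference waveform has its minimum there, and the two waveforms add up to a constant.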

Referring to steps S42-S43 of the method in FIG. 11, which incidentally can share processing with step S2 in FIG. 10, the estimation of the (relative) alignment position from the alignment sensor detector intensity signal(s) I_(D,sum)(x_(stage)) and/or I_(D,diff)(x_(stage)) is now discussed. From the above result one can observe that for a symmetrical grating, in which the (complex) electric field amplitudes E_(P,ν) and E_(P,−ν) are equal, the maximum of the alignment sensor detector intensity sum signal I_(D,sum)(x_(stage)) is located at x_(stage)=0. For the alignment sensor detector intensity difference signal I_(D,diff)(x_(stage)) a minimum is located at x_(stage)=0. The alignment position estimation is based upon this property. Clearly the zero (reference) position for this purpose is centered on some defined part of the mark, and will itself have a certain position relative to a coordinate system of, for example, the substrate as a whole. Later a refinement is described allowing the reference position to be defined in a flexible way to suit the application, and to allow for the fact that the mark itself may be distorted and its reference position is not straightforward to define.

Given for example the position-dependent sum intensity signal I_(D,sum)(x_(stage)) received from the alignment sensor detector, one can estimate the (relative) phase of the term

$2 \cdot {{Re}\left( {{\exp \left( \frac{{- } \cdot 4 \cdot \pi \cdot v \cdot x_{stage}}{P} \right)} \cdot \left( {{{\underset{\_}{E}}_{B,{- v}}^{H} \cdot \frac{1}{2} \cdot \begin{bmatrix}1 & 1 \\1 & 1\end{bmatrix} \cdot {\underset{\_}{E}}_{A,v}} + {{\underset{\_}{E}}_{A,{- v}}^{H} \cdot \frac{1}{2} \cdot \begin{bmatrix}1 & 1 \\1 & 1\end{bmatrix} \cdot {\underset{\_}{E}}_{B,v}}} \right)} \right)}$

using for example a projection or a fit approach. Using this projection or fit approach, the alignment sensor detector intensity signal (for each color) is decomposed by the processing unit PU using for example a Fourier transform as follows:

$u_{0} + \sum_{\nu = 1}^{N} \left( u_{\nu,\cos} \cdot \cos\left(\frac{4 \cdot \pi \cdot \nu \cdot x_{stage}}{P}\right) + u_{\nu,\sin} \cdot \sin\left(\frac{4 \cdot \pi \cdot \nu \cdot x_{stage}}{P}\right) \right) \approx I_{D}(x_{stage}).$

In this equation u₀ is a zero order (DC) coefficient, while u_(ν) is generally a νth order Fourier coefficient. For each order there is a cosine coefficient u_(ν,cos) and a sine coefficient u_(ν,sin). The relationship between these two corresponds to the phase of the periodic component of that order. In physical terms, each value of ν corresponding to a diffraction order in the diffraction spectrum of the target grating gives rise directly to a corresponding order (harmonic component) in the position-dependent detector waveform. Based on the above decomposition (step S42) there are computed a phase φ_(ν) for each order and consequently (step S43) a (relative) alignment position x_(align,ν) using the formulae:

$\left\{ \begin{matrix} \phi_{\nu} = \operatorname{atan2}\left( u_{\nu,\sin}, u_{\nu,\cos} \right) \\ x_{align,\nu} = \frac{-P}{4 \cdot \pi \cdot \nu} \cdot \phi_{\nu} \end{matrix} \right. .$

So it can be seen that one computes (or at least can compute) multiple (relative) alignment positions from each waveform, one for each positive value of ν, i.e. ν∈{1, …, N}. Note that in case of rigorous modeling of the alignment mark (as opposed to a strictly Fourier optics model), (relative) alignment positions can be estimated for even orders, i.e. ν∈{2, 4, …}, as the (complex) electric field amplitudes E_(P,ν) and E_(P,−ν) are (in general) non-zero for these even orders. Also, when asymmetry occurs in the alignment mark, the complex electric field amplitudes are (in general) non-zero for these even orders, and the even orders may carry particular information about asymmetry. Note that one can derive the phase and hence the alignment position equally using either the sum signal I_(D,sum)(x_(stage)) or the difference signal I_(D,diff)(x_(stage)), provided one takes account of the minus sign in front of the periodic component. One can also use both sum and difference signals in combination. Using both signals can improve signal to noise ratios, as they use different sets of photons and therefore their noise components (or at least those due for example to photon shot noise and detector noise) should be uncorrelated.
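Steps S42 and S43 can be sketched in a few lines. The example below uses an ordinary least-squares fit as one possible form of the "fit approach" mentioned above; the function name and the synthetic waveform are assumptions made for illustration, and the processing unit PU may well work differently.

```python
import numpy as np

def estimate_align_positions(x_stage, intensity, pitch, n_orders):
    """Fit u0 + sum_v (u_cos cos + u_sin sin), then convert each phase to a position."""
    x_stage = np.asarray(x_stage, dtype=float)
    orders = np.arange(1, n_orders + 1)
    columns = [np.ones_like(x_stage)]
    for v in orders:
        arg = 4.0 * np.pi * v * x_stage / pitch
        columns.extend([np.cos(arg), np.sin(arg)])
    design = np.column_stack(columns)
    coeffs, *_ = np.linalg.lstsq(design, intensity, rcond=None)
    u0 = coeffs[0]
    u_cos = coeffs[1::2]  # cosine coefficients for v = 1..N
    u_sin = coeffs[2::2]  # sine coefficients for v = 1..N
    phi = np.arctan2(u_sin, u_cos)                    # step S42
    x_align = -pitch / (4.0 * np.pi * orders) * phi   # step S43
    return u0, u_cos, u_sin, x_align

# Synthetic waveform (hypothetical numbers) whose harmonics peak at x0.
pitch, x0 = 1.6, 0.01
x = np.linspace(0.0, pitch, 400, endpoint=False)
waveform = (3.0
            + 2.0 * np.cos(4.0 * np.pi * 1 * (x - x0) / pitch)
            + 0.5 * np.cos(4.0 * np.pi * 2 * (x - x0) / pitch))
u0, u_cos, u_sin, x_align = estimate_align_positions(x, waveform, pitch, n_orders=2)
```

With the sign convention of the formula for x_(align,ν), a waveform whose harmonics peak at x0 yields x_(align,ν) = −x0 for every order, so the per-order estimates agree for this undistorted synthetic signal. For an asymmetric mark the per-order values would in general differ.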

Referring to step S44, the influence on the estimated alignment position of photon Poisson noise at the level of the alignment sensor detectors 430A, 430B is now discussed. This noise estimation allows the best signals to be selected for use in calculating the position measurement. In order to compute the noise sensitivity of the estimated alignment position for a given color, order etc., the following derivatives are computed:

$\left\{ \begin{matrix} \frac{\partial x_{align,\nu}}{\partial u_{\nu,\cos}} = \frac{\partial}{\partial u_{\nu,\cos}} \left( \frac{-P}{4 \cdot \pi \cdot \nu} \cdot \operatorname{atan2}\left( u_{\nu,\sin}, u_{\nu,\cos} \right) \right) \\ \frac{\partial x_{align,\nu}}{\partial u_{\nu,\sin}} = \frac{\partial}{\partial u_{\nu,\sin}} \left( \frac{-P}{4 \cdot \pi \cdot \nu} \cdot \operatorname{atan2}\left( u_{\nu,\sin}, u_{\nu,\cos} \right) \right) \end{matrix} \right.$

$\left\{ \begin{matrix} \frac{\partial x_{align,\nu}}{\partial u_{\nu,\cos}} = \frac{-P}{4 \cdot \pi \cdot \nu} \cdot \frac{-u_{\nu,\sin}}{u_{\nu,\cos}^{2} + u_{\nu,\sin}^{2}} \\ \frac{\partial x_{align,\nu}}{\partial u_{\nu,\sin}} = \frac{-P}{4 \cdot \pi \cdot \nu} \cdot \frac{u_{\nu,\cos}}{u_{\nu,\cos}^{2} + u_{\nu,\sin}^{2}} \end{matrix} \right. .$

It is assumed that the total number of photons at detector level within a detector integration time interval is (very) large, so that the Poisson distribution (which describes the number of photons arriving at the detector) is approximated well by the normal (i.e. Gaussian) distribution. It is also assumed that the noise is white noise. Note that if a discrete Fourier transform is made of a white noise signal, then all spectral components will have an expected value that equals zero, and will have an identical variance. As the periodic components

${\cos \left( \frac{4 \cdot \pi \cdot v \cdot x_{stage}}{P} \right)}\mspace{14mu} {and}\mspace{14mu} {\sin \left( \frac{4 \cdot \pi \cdot v \cdot x_{stage}}{P} \right)}$

will be mutually orthogonal on the scan trajectory interval, it can be concluded that cov(u_(ν,cos),u_(ν,sin))=0. Hence the following result can be derived:

$\begin{aligned} \sigma_{x_{align,\nu}}^{2} &= \frac{\partial x_{align,\nu}}{\partial u_{\nu,\cos}} \cdot \sigma_{u_{\nu,\cos}}^{2} \cdot \frac{\partial x_{align,\nu}}{\partial u_{\nu,\cos}} + \frac{\partial x_{align,\nu}}{\partial u_{\nu,\sin}} \cdot \sigma_{u_{\nu,\sin}}^{2} \cdot \frac{\partial x_{align,\nu}}{\partial u_{\nu,\sin}} \\ &= \left( \frac{-P}{4 \cdot \pi \cdot \nu} \cdot \frac{-u_{\nu,\sin}}{u_{\nu,\cos}^{2} + u_{\nu,\sin}^{2}} \right)^{2} \cdot \sigma_{u_{\nu,\cos}}^{2} + \left( \frac{-P}{4 \cdot \pi \cdot \nu} \cdot \frac{u_{\nu,\cos}}{u_{\nu,\cos}^{2} + u_{\nu,\sin}^{2}} \right)^{2} \cdot \sigma_{u_{\nu,\sin}}^{2} \\ &= \left( \frac{P}{4 \cdot \pi \cdot \nu} \cdot \frac{u_{\nu,\sin}}{u_{\nu,\cos}^{2} + u_{\nu,\sin}^{2}} \right)^{2} \cdot \sigma_{u_{\nu,\cos}}^{2} + \left( \frac{P}{4 \cdot \pi \cdot \nu} \cdot \frac{u_{\nu,\cos}}{u_{\nu,\cos}^{2} + u_{\nu,\sin}^{2}} \right)^{2} \cdot \sigma_{u_{\nu,\sin}}^{2} \\ &= \left( \frac{P}{4 \cdot \pi \cdot \nu} \cdot \frac{1}{u_{\nu,\cos}^{2} + u_{\nu,\sin}^{2}} \right)^{2} \cdot \left( u_{\nu,\sin}^{2} \cdot \sigma_{u_{\nu,\cos}}^{2} + u_{\nu,\cos}^{2} \cdot \sigma_{u_{\nu,\sin}}^{2} \right). \end{aligned}$
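The final line of this derivation translates directly into code. The following is a minimal sketch; the function name and the numerical inputs are hypothetical and serve only to show the propagation of coefficient variances into a position variance.

```python
import numpy as np

def align_position_variance(u_cos, u_sin, var_u_cos, var_u_sin, pitch, order):
    """Variance of x_align for one order, assuming cov(u_cos, u_sin) = 0."""
    amp_sq = u_cos**2 + u_sin**2
    scale = pitch / (4.0 * np.pi * order * amp_sq)
    return scale**2 * (u_sin**2 * var_u_cos + u_cos**2 * var_u_sin)

# Hypothetical coefficient values and variances for the first order.
var_x = align_position_variance(3.0, 4.0, 0.01, 0.01, pitch=2.0, order=1)
```

When the two coefficient variances are equal to a common value σ_u², the expression reduces to (P/(4·π·ν))²·σ_u²/(u_(ν,cos)²+u_(ν,sin)²), so the position noise falls as the amplitude of the harmonic grows and as the order ν increases.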

In order to simplify the computation of σ_(u_(ν,cos))² and σ_(u_(ν,sin))² it is assumed that the asymmetry of the target is sufficiently small, so that it can be neglected. Further, it is assumed that the alignment target is positioned such that it is symmetrical around x_(stage)=0. In this particular case the following identities will hold: E_(A,ν)=E_(A,−ν) and E_(B,ν)=E_(B,−ν).

From the derivation presented earlier, the sum alignment detector intensity can be derived as follows (simplified according to those simplifying assumptions):

$\begin{matrix}{{I_{D,{sum}}\left( x_{stage} \right)} = {\sum\limits_{{v = {- N}}{v \neq 0}}^{N}\begin{pmatrix}{{{\underset{\_}{E}}_{A,v}^{H} \cdot \frac{1}{2} \cdot \begin{bmatrix}1 & 1 \\1 & 1\end{bmatrix} \cdot {\underset{\_}{E}}_{A,v}} + {{\underset{\_}{E}}_{B,{- v}}^{H} \cdot \frac{1}{2} \cdot \begin{bmatrix}1 & 1 \\1 & 1\end{bmatrix} \cdot {\underset{\_}{E}}_{B,{- v}}} +} \\{{{\underset{\_}{E}}_{B,v}^{H} \cdot \frac{1}{2} \cdot \begin{bmatrix}1 & 1 \\1 & 1\end{bmatrix} \cdot {\underset{\_}{E}}_{B,v}} + {{\underset{\_}{E}}_{A,{- v}}^{H} \cdot \frac{1}{2} \cdot \begin{bmatrix}1 & 1 \\1 & 1\end{bmatrix} \cdot {\underset{\_}{E}}_{A,{- v}}} +} \\{2 \cdot {{Re}\begin{pmatrix}{\exp {\left( \frac{{- } \cdot 4 \cdot \pi \cdot v \cdot x_{stage}}{P} \right) \cdot}} \\\begin{pmatrix}{{{\underset{\_}{E}}_{B,{- v}}^{H} \cdot \frac{1}{2} \cdot \begin{bmatrix}1 & 1 \\1 & 1\end{bmatrix} \cdot {\underset{\_}{E}}_{A,v}} +} \\{{\underset{\_}{E}}_{A,{- v}}^{H} \cdot \frac{1}{2} \cdot \begin{bmatrix}1 & 1 \\1 & 1\end{bmatrix} \cdot {\underset{\_}{E}}_{B,v}}\end{pmatrix}\end{pmatrix}}}\end{pmatrix}}} \\{= {\sum\limits_{{v = {- N}}{v \neq 0}}^{N}\begin{pmatrix}{{{\underset{\_}{E}}_{A,v}^{H} \cdot \frac{1}{2} \cdot \begin{bmatrix}1 & 1 \\1 & 1\end{bmatrix} \cdot {\underset{\_}{E}}_{A,v}} + {{\underset{\_}{E}}_{B,v}^{H} \cdot \frac{1}{2} \cdot \begin{bmatrix}1 & 1 \\1 & 1\end{bmatrix} \cdot {\underset{\_}{E}}_{B,v}} +} \\{{{\underset{\_}{E}}_{B,v}^{H} \cdot \frac{1}{2} \cdot \begin{bmatrix}1 & 1 \\1 & 1\end{bmatrix} \cdot {\underset{\_}{E}}_{B,v}} + {{\underset{\_}{E}}_{A,v}^{H} \cdot \frac{1}{2} \cdot \begin{bmatrix}1 & 1 \\1 & 1\end{bmatrix} \cdot {\underset{\_}{E}}_{A,v}} +} \\{2 \cdot {{Re}\begin{pmatrix}{\exp {\left( \frac{{- } \cdot 4 \cdot \pi \cdot v \cdot x_{stage}}{P} \right) \cdot}} \\\begin{pmatrix}{{{\underset{\_}{E}}_{B,v}^{H} \cdot \frac{1}{2} \cdot \begin{bmatrix}1 & 1 \\1 & 1\end{bmatrix} \cdot {\underset{\_}{E}}_{A,v}} +} 
\\{{\underset{\_}{E}}_{A,v}^{H} \cdot \frac{1}{2} \cdot \begin{bmatrix}1 & 1 \\1 & 1\end{bmatrix} \cdot {\underset{\_}{E}}_{B,v}}\end{pmatrix}\end{pmatrix}}}\end{pmatrix}}} \\{= {\sum\limits_{{v = {- N}}{v \neq 0}}^{N}\begin{pmatrix}\begin{matrix}{{2 \cdot {\underset{\_}{E}}_{A,v}^{H} \cdot \frac{1}{2} \cdot \begin{bmatrix}1 & 1 \\1 & 1\end{bmatrix} \cdot {\underset{\_}{E}}_{A,v}^{\;}} +} \\{{2 \cdot {\underset{\_}{E}}_{B,v}^{H} \cdot \frac{1}{2} \cdot \begin{bmatrix}1 & 1 \\1 & 1\end{bmatrix} \cdot {\underset{\_}{E}}_{B,v}} +}\end{matrix} \\{4 \cdot {{Re}\begin{pmatrix}{\exp {\left( \frac{{- } \cdot 4 \cdot \pi \cdot v \cdot x_{stage}}{P} \right) \cdot}} \\\left( {{\underset{\_}{E}}_{B,y}^{H} \cdot \frac{1}{2} \cdot \begin{bmatrix}1 & 1 \\1 & 1\end{bmatrix} \cdot {\underset{\_}{E}}_{A,v}} \right)\end{pmatrix}}}\end{pmatrix}}}\end{matrix}$

Note that one could equally start with the difference detector intensity signal. It turns out that the conclusion as to the variance σ_(x_(align,ν))² is unaffected, so just the sum detector is considered here as the example.

A detector gain scaling constant G can be defined as follows:

$G = \sqrt{\frac{I_{D}}{N}}, \qquad N = \frac{I_{D}}{G^{2}},$

where N denotes a number of photon-electrons, by which is meant the proportion of photons that are converted into electrons so as to give rise to a signal in the detector. As the photon-electron arrival is a Poisson process, the instantaneous variance of the detector signal equals the number of instantaneous photoelectrons at the detector. This property allows computation of the variance σ_(I_(D,sum))²(x_(stage)) of I_(D,sum)(x_(stage)) as follows:

${I_{D,{sum}}\left( x_{stage} \right)} = {\sum\limits_{{v = {- N}}{v \neq 0}}^{N}\begin{pmatrix}\begin{matrix}{{2 \cdot {\underset{\_}{E}}_{A,v}^{H} \cdot \frac{1}{2} \cdot \begin{bmatrix}1 & 1 \\1 & 1\end{bmatrix} \cdot {\underset{\_}{E}}_{A,v}} +} \\{{2 \cdot {\underset{\_}{E}}_{B,v}^{H} \cdot \frac{1}{2} \cdot \begin{bmatrix}1 & 1 \\1 & 1\end{bmatrix} \cdot {\underset{\_}{E}}_{B,v}} +}\end{matrix} \\{4 \cdot {{Re}\begin{pmatrix}{\exp {\left( \frac{{- } \cdot 4 \cdot \pi \cdot v \cdot x_{stage}}{P} \right) \cdot}} \\\left( {{\underset{\_}{E}}_{B,v}^{H} \cdot \frac{1}{2} \cdot \begin{bmatrix}1 & 1 \\1 & 1\end{bmatrix} \cdot {\underset{\_}{E}}_{A,v}} \right)\end{pmatrix}}}\end{pmatrix}}$${N_{D,{sum}}\left( x_{stage} \right)} = {\frac{1}{G^{2}} \cdot {I_{D,{sum}}\left( x_{stage} \right)}}$${\sigma_{N_{D,{sum}}}^{2}\left( x_{stage} \right)} = {\frac{1}{G^{2}} \cdot {I_{D,{sum}}\left( x_{stage} \right)}}$σ_(I_(D, sum))²(x_(stage)) = G² ⋅ I_(D, sum)(x_(stage)).

Recalling the following expression:

$u_{0} + \sum_{\nu = 1}^{N} \left( u_{\nu,\cos} \cdot \cos\left(\frac{4 \cdot \pi \cdot \nu \cdot x_{stage}}{P}\right) + u_{\nu,\sin} \cdot \sin\left(\frac{4 \cdot \pi \cdot \nu \cdot x_{stage}}{P}\right) \right) \approx I_{D}(x_{stage})$

and combining it with this one:

${I_{D,{sum}}\left( x_{stage} \right)} = {\sum\limits_{{v = {- N}}{v \neq 0}}^{N}\begin{pmatrix}{{{\underset{\_}{E}}_{A,v}^{H} \cdot \frac{1}{2} \cdot \begin{bmatrix}1 & 1 \\1 & 1\end{bmatrix} \cdot {\underset{\_}{E}}_{A,v}} + {{\underset{\_}{E}}_{B,{- v}}^{H} \cdot \frac{1}{2} \cdot \begin{bmatrix}1 & 1 \\1 & 1\end{bmatrix} \cdot {\underset{\_}{E}}_{B,{- v}}} +} \\{{{\underset{\_}{E}}_{B,v}^{H} \cdot \frac{1}{2} \cdot \begin{bmatrix}1 & 1 \\1 & 1\end{bmatrix} \cdot {\underset{\_}{E}}_{B,v}} + {{\underset{\_}{E}}_{A,{- v}}^{H} \cdot \frac{1}{2} \cdot \begin{bmatrix}1 & 1 \\1 & 1\end{bmatrix} \cdot {\underset{\_}{E}}_{A,{- v}}} +} \\{2 \cdot {{Re}\begin{pmatrix}{\exp {\left( \frac{{- } \cdot 4 \cdot \pi \cdot v \cdot x_{stage}}{P} \right) \cdot}} \\\begin{pmatrix}{{{\underset{\_}{E}}_{B,{- v}}^{H} \cdot \frac{1}{2} \cdot \begin{bmatrix}1 & 1 \\1 & 1\end{bmatrix} \cdot {\underset{\_}{E}}_{A,v}} +} \\{{\underset{\_}{E}}_{A,{- v}}^{H} \cdot \frac{1}{2} \cdot \begin{bmatrix}1 & 1 \\1 & 1\end{bmatrix} \cdot {\underset{\_}{E}}_{B,v}}\end{pmatrix}\end{pmatrix}}}\end{pmatrix}}$

it can be observed that the following two identities hold (in general):

$\mspace{20mu} {u_{0} \approx {\sum\limits_{\substack{v = {- N} \\ v \neq 0}}^{N\;}\; \begin{pmatrix}{{{\underset{\_}{E}}_{A,v}^{H} \cdot \frac{1}{2} \cdot \begin{bmatrix}1 & 1 \\1 & 1\end{bmatrix} \cdot {\underset{\_}{E}}_{A,v}} +} \\{{{\underset{\_}{E}}_{B,{- v}}^{H} \cdot \frac{1}{2} \cdot \begin{bmatrix}1 & 1 \\1 & 1\end{bmatrix} \cdot {\underset{\_}{E}}_{B,{- v}}} +} \\{{{\underset{\_}{E}}_{B,v}^{H} \cdot \frac{1}{2} \cdot \begin{bmatrix}1 & 1 \\1 & 1\end{bmatrix} \cdot {\underset{\_}{E}}_{B,v}} +} \\{{{\underset{\_}{E}}_{A,{- v}}^{H} \cdot \frac{1}{2} \cdot \begin{bmatrix}1 & 1 \\1 & 1\end{bmatrix} \cdot {\underset{\_}{E}}_{A,{- v}}} +}\end{pmatrix}}}$   and$\sqrt{u_{v,\cos}^{2} + u_{v,\sin}^{2}} \approx {2 \cdot {{{{{\underset{\_}{E}}_{B,{- v}}^{H} \cdot \frac{1}{2} \cdot \begin{bmatrix}1 & 1 \\1 & 1\end{bmatrix} \cdot {\underset{\_}{E}}_{A,v}} + {{\underset{\_}{E}}_{A,{- v}}^{H} \cdot \frac{1}{2} \cdot \begin{bmatrix}1 & 1 \\1 & 1\end{bmatrix} \cdot {\underset{\_}{E}}_{B,v}}}}.}}$

Hence u₀ and √(u_(ν,cos)²+u_(ν,sin)²) can be used as estimators for the intensity when actually measuring signals. If the above two identities are simplified for the particular case described above (i.e. where E_(A,ν)=E_(A,−ν) and E_(B,ν)=E_(B,−ν) hold), this yields:

$u_{0} \approx 2 \cdot \sum_{\substack{\nu = -N \\ \nu \neq 0}}^{N} \left( \underline{E}_{A,\nu}^{H} \cdot \frac{1}{2} \cdot \begin{bmatrix}1 & 1 \\ 1 & 1\end{bmatrix} \cdot \underline{E}_{A,\nu} + \underline{E}_{B,\nu}^{H} \cdot \frac{1}{2} \cdot \begin{bmatrix}1 & 1 \\ 1 & 1\end{bmatrix} \cdot \underline{E}_{B,\nu} \right)$

and

$\sqrt{u_{\nu,\cos}^{2} + u_{\nu,\sin}^{2}} \approx 4 \cdot \left| \underline{E}_{B,\nu}^{H} \cdot \frac{1}{2} \cdot \begin{bmatrix}1 & 1 \\ 1 & 1\end{bmatrix} \cdot \underline{E}_{A,\nu} \right|.$

For later use, there is introduced a convenient shorthand notation for the maximum number of photoelectrons in the sum detector intensity signal as:

$N_{\max} = \max\left( N_{D,sum}(x_{stage}) \right) = \frac{1}{G^{2}} \cdot 4 \cdot \sum_{\substack{\nu' = -N \\ \nu' \neq 0}}^{N} \left( \underline{E}_{A,\nu'}^{H} \cdot \frac{1}{2} \cdot \begin{bmatrix}1 & 1 \\ 1 & 1\end{bmatrix} \cdot \underline{E}_{A,\nu'} + \underline{E}_{B,\nu'}^{H} \cdot \frac{1}{2} \cdot \begin{bmatrix}1 & 1 \\ 1 & 1\end{bmatrix} \cdot \underline{E}_{B,\nu'} \right).$

The variances σ_(u_(ν,cos))² and σ_(u_(ν,sin))² can now be computed by summing the instantaneous variances of the alignment detector signal over one full period of the alignment signal, and taking energy conservation into account. The calculation goes as follows:

$\left\{ \begin{matrix}{\sigma_{u_{v,\sin}}^{2} = \frac{\int_{0}^{\frac{P}{2 \cdot v}}{{\sigma_{I_{D,{sum}}}^{2}\left( x_{stage} \right)} \cdot \left| {\sin \left( \frac{4 \cdot \pi \cdot v \cdot x_{stage}}{P} \right)} \right| \cdot dx_{stage}}}{\int_{0}^{\frac{P}{2 \cdot v}}{\left( {\left| {\sin \left( \frac{4 \cdot \pi \cdot v \cdot x_{stage}}{P} \right)} \right| + \left| {\cos \left( \frac{4 \cdot \pi \cdot v \cdot x_{stage}}{P} \right)} \right|} \right) \cdot dx_{stage}}}} \\{\sigma_{u_{v,\cos}}^{2} = \frac{\int_{0}^{\frac{P}{2 \cdot v}}{{\sigma_{I_{D,{sum}}}^{2}\left( x_{stage} \right)} \cdot \left| {\cos \left( \frac{4 \cdot \pi \cdot v \cdot x_{stage}}{P} \right)} \right| \cdot dx_{stage}}}{\int_{0}^{\frac{P}{2 \cdot v}}{\left( {\left| {\sin \left( \frac{4 \cdot \pi \cdot v \cdot x_{stage}}{P} \right)} \right| + \left| {\cos \left( \frac{4 \cdot \pi \cdot v \cdot x_{stage}}{P} \right)} \right|} \right) \cdot dx_{stage}}}}\end{matrix} \right.$

$\left\{ \begin{matrix}{\sigma_{u_{v,\sin}}^{2} = \frac{\int_{0}^{\frac{P}{2 \cdot v}}{G^{2} \cdot {I_{D,{sum}}\left( x_{stage} \right)} \cdot \left| {\sin \left( \frac{4 \cdot \pi \cdot v \cdot x_{stage}}{P} \right)} \right| \cdot dx_{stage}}}{\int_{0}^{\frac{P}{2 \cdot v}}{\left( {\left| {\sin \left( \frac{4 \cdot \pi \cdot v \cdot x_{stage}}{P} \right)} \right| + \left| {\cos \left( \frac{4 \cdot \pi \cdot v \cdot x_{stage}}{P} \right)} \right|} \right) \cdot dx_{stage}}}} \\{\sigma_{u_{v,\cos}}^{2} = \frac{\int_{0}^{\frac{P}{2 \cdot v}}{G^{2} \cdot {I_{D,{sum}}\left( x_{stage} \right)} \cdot \left| {\cos \left( \frac{4 \cdot \pi \cdot v \cdot x_{stage}}{P} \right)} \right| \cdot dx_{stage}}}{\int_{0}^{\frac{P}{2 \cdot v}}{\left( {\left| {\sin \left( \frac{4 \cdot \pi \cdot v \cdot x_{stage}}{P} \right)} \right| + \left| {\cos \left( \frac{4 \cdot \pi \cdot v \cdot x_{stage}}{P} \right)} \right|} \right) \cdot dx_{stage}}}}\end{matrix} \right.$

$\left\{ \begin{matrix}{\sigma_{u_{v,\sin}}^{2} = {G^{2} \cdot \frac{\pi \cdot v}{2 \cdot P} \cdot {\int_{0}^{\frac{P}{2 \cdot v}}{{I_{D,{sum}}\left( x_{stage} \right)} \cdot \left| {\sin \left( \frac{4 \cdot \pi \cdot v \cdot x_{stage}}{P} \right)} \right| \cdot dx_{stage}}}}} \\{\sigma_{u_{v,\cos}}^{2} = {G^{2} \cdot \frac{\pi \cdot v}{2 \cdot P} \cdot {\int_{0}^{\frac{P}{2 \cdot v}}{{I_{D,{sum}}\left( x_{stage} \right)} \cdot \left| {\cos \left( \frac{4 \cdot \pi \cdot v \cdot x_{stage}}{P} \right)} \right| \cdot dx_{stage}}}}}\end{matrix} \right.$

The above two integrals can be evaluated indirectly by means of numerical Monte Carlo computations, to compute the variance σ_(x_(align,ν))² directly for the generic case. The principles and practice of such calculations are well known to the skilled reader. Broadly speaking, one makes copies of the alignment signal I_(D,sum)(x_(stage)) and adds noise to each copy. Next, an alignment position is computed for each noisy alignment signal, and finally the variance of the resulting alignment positions is computed.
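By way of illustration only, the Monte Carlo procedure just outlined can be sketched in Python as follows. The sketch assumes a signal containing first order information only, a noise model in which each sample is Gaussian with variance G²·I_(D,sum) (standing in for the photon statistics), and a position estimate taken from the phase of the cosine and sine projections. The function names, the sampling density and all numerical values are hypothetical and are not taken from the apparatus.

```python
import math
import random
import statistics


def estimate_position(samples, xs, period):
    """Return the first-order alignment position from the cos/sin projections."""
    k = 4.0 * math.pi / period  # angular spatial frequency of the v = 1 component
    u_cos = sum(s * math.cos(k * x) for s, x in zip(samples, xs))
    u_sin = sum(s * math.sin(k * x) for s, x in zip(samples, xs))
    return math.atan2(u_sin, u_cos) / k


def monte_carlo_position_variance(period, u0, amplitude, x_true, gain,
                                  n_samples=64, n_trials=500, seed=0):
    """Add noise to copies of a clean signal and return (mean, variance) of the positions."""
    rng = random.Random(seed)
    k = 4.0 * math.pi / period
    # Sample one full period (P/2) of the first-order alignment signal.
    xs = [i * (period / 2.0) / n_samples for i in range(n_samples)]
    clean = [u0 + amplitude * math.cos(k * (x - x_true)) for x in xs]
    positions = []
    for _ in range(n_trials):
        # Assumed noise model: Gaussian with variance gain^2 * intensity.
        noisy = [rng.gauss(c, gain * math.sqrt(c)) for c in clean]
        positions.append(estimate_position(noisy, xs, period))
    return statistics.mean(positions), statistics.variance(positions)


if __name__ == "__main__":
    mean_x, var_x = monte_carlo_position_variance(1000.0, 1000.0, 800.0, 20.0, 1.0)
    print(mean_x, var_x)
```

In such a sketch, increasing the hypothetical gain increases the variance of the estimated position, while the mean remains close to the position used to generate the clean signal.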

As an alternative to the numerical solution, it may be useful to have an analytical “rule of thumb” expressing the relationship between the number of photons and the variance of the estimated, relative alignment position. To obtain this rule of thumb, it is also assumed that the alignment signal consists of first order diffraction information only. In this particular case, the above integrals can be simplified into:

$\left\{ \begin{matrix}{\sigma_{u_{{v = 1},\sin}}^{2} = \frac{\int_{0}^{\frac{P}{2}}{G^{2} \cdot {I_{D,{sum}}\left( x_{stage} \right)} \cdot \left| {\sin \left( \frac{4 \cdot \pi \cdot x_{stage}}{P} \right)} \right| \cdot dx_{stage}}}{\int_{0}^{\frac{P}{2}}{\left( {\left| {\sin \left( \frac{4 \cdot \pi \cdot x_{stage}}{P} \right)} \right| + \left| {\cos \left( \frac{4 \cdot \pi \cdot x_{stage}}{P} \right)} \right|} \right) \cdot dx_{stage}}}} \\{\sigma_{u_{{v = 1},\cos}}^{2} = \frac{\int_{0}^{\frac{P}{2}}{G^{2} \cdot {I_{D,{sum}}\left( x_{stage} \right)} \cdot \left| {\cos \left( \frac{4 \cdot \pi \cdot x_{stage}}{P} \right)} \right| \cdot dx_{stage}}}{\int_{0}^{\frac{P}{2}}{\left( {\left| {\sin \left( \frac{4 \cdot \pi \cdot x_{stage}}{P} \right)} \right| + \left| {\cos \left( \frac{4 \cdot \pi \cdot x_{stage}}{P} \right)} \right|} \right) \cdot dx_{stage}}}}\end{matrix} \right.$

$\left\{ \begin{matrix}{\sigma_{u_{{v = 1},\sin}}^{2} = \frac{\int_{0}^{\frac{P}{2}}{G^{2} \cdot {\sum\limits_{\substack{v^{\prime} = {- 1} \\ v^{\prime} \neq 0}}^{1}\begin{pmatrix}{2 \cdot {\underset{\_}{E}}_{A,v^{\prime}}^{H} \cdot \frac{1}{2} \cdot \begin{bmatrix}1 & 1 \\1 & 1\end{bmatrix} \cdot {\underset{\_}{E}}_{A,v^{\prime}}} + \\{2 \cdot {\underset{\_}{E}}_{B,v^{\prime}}^{H} \cdot \frac{1}{2} \cdot \begin{bmatrix}1 & 1 \\1 & 1\end{bmatrix} \cdot {\underset{\_}{E}}_{B,v^{\prime}}} + \\{4 \cdot \operatorname{Re}\left( {\exp \left( \frac{{- i} \cdot 4 \cdot \pi \cdot v^{\prime} \cdot x_{stage}}{P} \right)} \cdot \left( {{\underset{\_}{E}}_{B,v^{\prime}}^{H} \cdot \frac{1}{2} \cdot \begin{bmatrix}1 & 1 \\1 & 1\end{bmatrix} \cdot {\underset{\_}{E}}_{A,v^{\prime}}} \right) \right)}\end{pmatrix}} \cdot \left| {\sin \left( \frac{4 \cdot \pi \cdot x_{stage}}{P} \right)} \right| \cdot dx_{stage}}}{\int_{0}^{\frac{P}{2}}{\left( {\left| {\sin \left( \frac{4 \cdot \pi \cdot x_{stage}}{P} \right)} \right| + \left| {\cos \left( \frac{4 \cdot \pi \cdot x_{stage}}{P} \right)} \right|} \right) \cdot dx_{stage}}}} \\{\sigma_{u_{{v = 1},\cos}}^{2} = \frac{\int_{0}^{\frac{P}{2}}{G^{2} \cdot {\sum\limits_{\substack{v^{\prime} = {- 1} \\ v^{\prime} \neq 0}}^{1}\begin{pmatrix}{2 \cdot {\underset{\_}{E}}_{A,v^{\prime}}^{H} \cdot \frac{1}{2} \cdot \begin{bmatrix}1 & 1 \\1 & 1\end{bmatrix} \cdot {\underset{\_}{E}}_{A,v^{\prime}}} + \\{2 \cdot {\underset{\_}{E}}_{B,v^{\prime}}^{H} \cdot \frac{1}{2} \cdot \begin{bmatrix}1 & 1 \\1 & 1\end{bmatrix} \cdot {\underset{\_}{E}}_{B,v^{\prime}}} + \\{4 \cdot \operatorname{Re}\left( {\exp \left( \frac{{- i} \cdot 4 \cdot \pi \cdot v^{\prime} \cdot x_{stage}}{P} \right)} \cdot \left( {{\underset{\_}{E}}_{B,v^{\prime}}^{H} \cdot \frac{1}{2} \cdot \begin{bmatrix}1 & 1 \\1 & 1\end{bmatrix} \cdot {\underset{\_}{E}}_{A,v^{\prime}}} \right) \right)}\end{pmatrix}} \cdot \left| {\cos \left( \frac{4 \cdot \pi \cdot x_{stage}}{P} \right)} \right| \cdot dx_{stage}}}{\int_{0}^{\frac{P}{2}}{\left( {\left| {\sin \left( \frac{4 \cdot \pi \cdot x_{stage}}{P} \right)} \right| + \left| {\cos \left( \frac{4 \cdot \pi \cdot x_{stage}}{P} \right)} \right|} \right) \cdot dx_{stage}}}}\end{matrix} \right.$

$\left\{ \begin{matrix}{\sigma_{u_{{v = 1},\sin}}^{2} = \frac{G^{2} \cdot {\sum\limits_{\substack{v^{\prime} = {- 1} \\ v^{\prime} \neq 0}}^{1}\begin{pmatrix}{2 \cdot {\underset{\_}{E}}_{A,v^{\prime}}^{H} \cdot \frac{1}{2} \cdot \begin{bmatrix}1 & 1 \\1 & 1\end{bmatrix} \cdot {\underset{\_}{E}}_{A,v^{\prime}}} + \\{2 \cdot {\underset{\_}{E}}_{B,v^{\prime}}^{H} \cdot \frac{1}{2} \cdot \begin{bmatrix}1 & 1 \\1 & 1\end{bmatrix} \cdot {\underset{\_}{E}}_{B,v^{\prime}}}\end{pmatrix}} \cdot {\int_{0}^{\frac{P}{2}}{\left| {\sin \left( \frac{4 \cdot \pi \cdot x_{stage}}{P} \right)} \right| \cdot dx_{stage}}}}{\int_{0}^{\frac{P}{2}}{\left( {\left| {\sin \left( \frac{4 \cdot \pi \cdot x_{stage}}{P} \right)} \right| + \left| {\cos \left( \frac{4 \cdot \pi \cdot x_{stage}}{P} \right)} \right|} \right) \cdot dx_{stage}}}} \\{\sigma_{u_{{v = 1},\cos}}^{2} = \frac{G^{2} \cdot {\sum\limits_{\substack{v^{\prime} = {- 1} \\ v^{\prime} \neq 0}}^{1}\begin{pmatrix}{2 \cdot {\underset{\_}{E}}_{A,v^{\prime}}^{H} \cdot \frac{1}{2} \cdot \begin{bmatrix}1 & 1 \\1 & 1\end{bmatrix} \cdot {\underset{\_}{E}}_{A,v^{\prime}}} + \\{2 \cdot {\underset{\_}{E}}_{B,v^{\prime}}^{H} \cdot \frac{1}{2} \cdot \begin{bmatrix}1 & 1 \\1 & 1\end{bmatrix} \cdot {\underset{\_}{E}}_{B,v^{\prime}}}\end{pmatrix}} \cdot {\int_{0}^{\frac{P}{2}}{\left| {\cos \left( \frac{4 \cdot \pi \cdot x_{stage}}{P} \right)} \right| \cdot dx_{stage}}}}{\int_{0}^{\frac{P}{2}}{\left( {\left| {\sin \left( \frac{4 \cdot \pi \cdot x_{stage}}{P} \right)} \right| + \left| {\cos \left( \frac{4 \cdot \pi \cdot x_{stage}}{P} \right)} \right|} \right) \cdot dx_{stage}}}}\end{matrix} \right.$

$\left\{ \begin{matrix}{\sigma_{u_{{v = 1},\sin}}^{2} = {G^{2} \cdot {\sum\limits_{\substack{v^{\prime} = {- 1} \\ v^{\prime} \neq 0}}^{1}\left( {{{\underset{\_}{E}}_{A,v^{\prime}}^{H} \cdot \frac{1}{2} \cdot \begin{bmatrix}1 & 1 \\1 & 1\end{bmatrix} \cdot {\underset{\_}{E}}_{A,v^{\prime}}} + {{\underset{\_}{E}}_{B,v^{\prime}}^{H} \cdot \frac{1}{2} \cdot \begin{bmatrix}1 & 1 \\1 & 1\end{bmatrix} \cdot {\underset{\_}{E}}_{B,v^{\prime}}}} \right)}}} \\{\sigma_{u_{{v = 1},\cos}}^{2} = {G^{2} \cdot {\sum\limits_{\substack{v^{\prime} = {- 1} \\ v^{\prime} \neq 0}}^{1}\left( {{{\underset{\_}{E}}_{A,v^{\prime}}^{H} \cdot \frac{1}{2} \cdot \begin{bmatrix}1 & 1 \\1 & 1\end{bmatrix} \cdot {\underset{\_}{E}}_{A,v^{\prime}}} + {{\underset{\_}{E}}_{B,v^{\prime}}^{H} \cdot \frac{1}{2} \cdot \begin{bmatrix}1 & 1 \\1 & 1\end{bmatrix} \cdot {\underset{\_}{E}}_{B,v^{\prime}}}} \right)}}.}\end{matrix} \right.$

In conclusion, a final variance of the estimated alignment position can now be stated, for the particular case in which the alignment signal consists of only the first diffraction order information, and the alignment mark is symmetrical about a zero value of the stage position. The variance of the estimated, relative alignment position equals:

$\sigma_{x_{{align},{v = 1}}}^{2} = {\left( {\frac{P}{4 \cdot \pi} \cdot \frac{1}{u_{{v = 1},\cos}^{2} + u_{{v = 1},\sin}^{2}}} \right)^{2} \cdot \left( {{u_{{v = 1},\sin}^{2} \cdot \sigma_{u_{{v = 1},\cos}}^{2}} + {u_{{v = 1},\cos}^{2} \cdot \sigma_{u_{{v = 1},\sin}}^{2}}} \right)}$

$\sigma_{x_{{align},{v = 1}}}^{2} = {\left( \frac{P}{4 \cdot \pi} \right)^{2} \cdot \frac{1}{u_{{v = 1},\cos}^{2} + u_{{v = 1},\sin}^{2}} \cdot G^{2} \cdot {\sum\limits_{\substack{v^{\prime} = {- 1} \\ v^{\prime} \neq 0}}^{1}\left( {{{\underset{\_}{E}}_{A,v^{\prime}}^{H} \cdot \frac{1}{2} \cdot \begin{bmatrix}1 & 1 \\1 & 1\end{bmatrix} \cdot {\underset{\_}{E}}_{A,v^{\prime}}} + {{\underset{\_}{E}}_{B,v^{\prime}}^{H} \cdot \frac{1}{2} \cdot \begin{bmatrix}1 & 1 \\1 & 1\end{bmatrix} \cdot {\underset{\_}{E}}_{B,v^{\prime}}}} \right)}}$

$\left\{ \begin{matrix}{\sigma_{x_{{align},{v = 1}}}^{2} = {\left( \frac{P}{4 \cdot \pi} \right)^{2} \cdot \frac{1}{u_{{v = 1},\cos}^{2} + u_{{v = 1},\sin}^{2}} \cdot \sigma_{u_{{v = 1},\cos,\sin}}^{2}}} \\{\sigma_{u_{{v = 1},\cos,\sin}}^{2} = {G^{2} \cdot {\sum\limits_{\substack{v^{\prime} = {- 1} \\ v^{\prime} \neq 0}}^{1}\left( {{{\underset{\_}{E}}_{A,v^{\prime}}^{H} \cdot \frac{1}{2} \cdot \begin{bmatrix}1 & 1 \\1 & 1\end{bmatrix} \cdot {\underset{\_}{E}}_{A,v^{\prime}}} + {{\underset{\_}{E}}_{B,v^{\prime}}^{H} \cdot \frac{1}{2} \cdot \begin{bmatrix}1 & 1 \\1 & 1\end{bmatrix} \cdot {\underset{\_}{E}}_{B,v^{\prime}}}} \right)}}} \\{u_{0} \approx {2 \cdot {\sum\limits_{\substack{v = {- N} \\ v \neq 0}}^{N}\left( {{{\underset{\_}{E}}_{A,v}^{H} \cdot \frac{1}{2} \cdot \begin{bmatrix}1 & 1 \\1 & 1\end{bmatrix} \cdot {\underset{\_}{E}}_{A,v}} + {{\underset{\_}{E}}_{B,v}^{H} \cdot \frac{1}{2} \cdot \begin{bmatrix}1 & 1 \\1 & 1\end{bmatrix} \cdot {\underset{\_}{E}}_{B,v}}} \right)}}}\end{matrix} \right.$

$\left\{ \begin{matrix}{\sigma_{x_{{align},{v = 1}}}^{2} = {\left( \frac{P}{4 \cdot \pi} \right)^{2} \cdot \frac{1}{u_{{v = 1},\cos}^{2} + u_{{v = 1},\sin}^{2}} \cdot \sigma_{u_{{v = 1},\cos,\sin}}^{2}}} \\{\sigma_{u_{{v = 1},\cos,\sin}}^{2} \approx {G^{2} \cdot \frac{u_{0}}{2}}.}\end{matrix} \right.$

As noted above, this last result holds for the intensity alignment signals from both the sum and difference detectors.

Note that as

${\cos\left( \frac{4 \cdot \pi \cdot v \cdot x_{stage}}{P} \right)}\mspace{14mu} {and}\mspace{14mu} \sin \mspace{14mu} \left( \frac{4 \cdot \pi \cdot v^{\prime} \cdot x_{stage}}{P} \right)$

will be mutually orthogonal on the scan trajectory interval, it can be concluded that cov(u_(ν,cos),u_(ν′,cos))=0, cov(u_(ν,sin),u_(ν′,sin))=0 and cov(u_(ν,cos),u_(ν′,sin))=0, for ν∈{1, . . . , N} and ν′∈{1, . . . , N}, and ν≠ν′. Hence the covariance matrix C_(x_(align,meas)) of multiple alignment position estimates will be a diagonal matrix. Therefore the diagonal matrix C_(x_(align,meas)) can be assembled by placing each σ_(x_(align,ν))² at the corresponding diagonal location. The above equations for the variances can be expressed in different forms, according to the nomenclature and conventions of the environment in which they are being used. In one embodiment, for example, they can be rewritten in the form:

$\sigma_{x_{{align},{v = 1}}}^{2} = {\left( \frac{P}{2 \cdot \pi} \right)^{2} \cdot \frac{1}{N_{\max}}}$

$\sigma_{x_{{align},{v = 1}}} = {\frac{P}{2 \cdot \pi} \cdot \frac{1}{\sqrt{N_{\max}}}}.$

To this end, one can make use of the identity

$4 \cdot \left| {{\underset{\_}{E}}_{B,{v = 1}}^{H} \cdot \frac{1}{2} \cdot \begin{bmatrix}1 & 1 \\1 & 1\end{bmatrix} \cdot {\underset{\_}{E}}_{A,{v = 1}}} \right| \approx {2 \cdot \left( {{{\underset{\_}{E}}_{A,{v = 1}}^{H} \cdot \frac{1}{2} \cdot \begin{bmatrix}1 & 1 \\1 & 1\end{bmatrix} \cdot {\underset{\_}{E}}_{A,{v = 1}}} + {{\underset{\_}{E}}_{B,{v = 1}}^{H} \cdot \frac{1}{2} \cdot \begin{bmatrix}1 & 1 \\1 & 1\end{bmatrix} \cdot {\underset{\_}{E}}_{B,{v = 1}}}} \right)},$

which is valid if γ=22.5° and is purely x-polarized or y-polarized. This result can be verified numerically to confirm that the “rule of thumb” calculation agrees with the full numerical solution.
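For illustration, the first-order variance expression and the rule of thumb stated above can be evaluated as in the following minimal Python sketch. The function names and the example numbers are hypothetical; the formulas themselves are simply transcribed from the expressions for σ_(x_(align,ν=1))² given above.

```python
import math


def position_variance(period, u_cos, u_sin, var_cos, var_sin):
    """Propagate the variances of u_cos and u_sin into the variance of the position."""
    amp_sq = u_cos ** 2 + u_sin ** 2
    scale = period / (4.0 * math.pi) / amp_sq
    return scale ** 2 * (u_sin ** 2 * var_cos + u_cos ** 2 * var_sin)


def rule_of_thumb_sigma(period, n_max):
    """Standard deviation of the position: P / (2*pi) / sqrt(N_max)."""
    return period / (2.0 * math.pi) / math.sqrt(n_max)


if __name__ == "__main__":
    # Hypothetical numbers: pitch in arbitrary length units, 1e6 photoelectrons.
    print(rule_of_thumb_sigma(1000.0, 1.0e6))
    print(position_variance(1000.0, 3.0, 4.0, 2.0, 2.0))
```

The rule of thumb makes the expected scaling explicit: under the stated assumptions, quadrupling N_(max) halves the standard deviation of the estimated position.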

Incidentally, in a case where the asymmetry measuring arrangement 460 and the alignment sensor both work in parallel and share the same illuminator, the same detector integration time (i.e. effective scan length) applies. Many of the calculations and derivations of results can be common to the different arrangements. It can also be arranged that the asymmetry sensor and the alignment sensor have the same scaling constant. Other noise sources can be taken into account, if they are known. For example, sensor electronics noise and/or mechanical vibration can be taken into account.

As seen above, a plethora of different alignment position measurements are in fact obtained from the position-varying intensity signals captured by detectors 430A, 430B. A different measurement x_(align)(λ₀,E_(S),ν) can be obtained for each combination of color (λ₀), polarization (E_(S)) and decomposed order (ν). There are other ways to derive a single position measurement x_(align) from these multiple measurements, besides just selecting a “best” one of the waveforms and orders. Rather than discarding all but the “best”, one can use an average of the measurements as the single result. Various different averages can be used, which can also be referred to as “location estimators”. These include means, medians, weighted means, or weighted medians. Outliers can also be discarded. Rank-based estimators such as Hodges-Lehmann estimators may be used. The average can be weighted in some way, if the relative quality of the different measurements x_(align)(λ₀,E_(S),ν) can be identified. The accompanying variances σ_(align)²(λ₀,E_(S),ν) of these measurements were computed above, and can be used for such weighting. Recall that the measurements are (assumed to be) uncorrelated. In the present apparatus, asymmetry corrections are applied to obtain corrected versions of the numerous position measurements, before the “best” single position measurement is calculated. While in principle the concept just described is to use “all” the measurements instead of discarding all but the “best” one, hybrid approaches are possible in which multiple measurements are used in the calculation after discarding some number of measurements that are judged to be the “worst”. This may be done for example to reduce processing effort. In addition, one or more statistical techniques may be applied, such as trimming (discarding outliers) or Winsorizing (adjusting outliers to fall within a predetermined percentile), before an average result is calculated.
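Some of the location estimators mentioned above can be sketched as follows, purely by way of example. The sketch assumes that the candidate positions and their variances are already available as plain lists, uses inverse-variance weights for the weighted mean, and trims or Winsorizes by a fixed count at each end rather than by percentile. All function names and sample values are hypothetical.

```python
import statistics
from itertools import combinations_with_replacement


def weighted_mean(values, variances):
    """Inverse-variance weighted mean of uncorrelated measurements."""
    weights = [1.0 / v for v in variances]
    return sum(w * x for w, x in zip(weights, values)) / sum(weights)


def trimmed_mean(values, n_trim):
    """Mean after discarding the n_trim smallest and n_trim largest values."""
    ordered = sorted(values)
    return statistics.mean(ordered[n_trim:len(ordered) - n_trim])


def winsorized_mean(values, n_clip):
    """Mean after clamping the n_clip most extreme values at each end."""
    ordered = sorted(values)
    low, high = ordered[n_clip], ordered[-n_clip - 1]
    return statistics.mean(min(max(x, low), high) for x in values)


def hodges_lehmann(values):
    """Median of all pairwise averages (Walsh averages), including self-pairs."""
    walsh = [(a + b) / 2.0 for a, b in combinations_with_replacement(values, 2)]
    return statistics.median(walsh)


if __name__ == "__main__":
    positions = [10.0, 10.2, 9.8, 10.1, 14.0]   # hypothetical candidates, one outlier
    variances = [0.01, 0.01, 0.01, 0.01, 1.0]   # hypothetical variances
    print(weighted_mean(positions, variances))
    print(trimmed_mean(positions, 1), winsorized_mean(positions, 1))
    print(statistics.median(positions), hodges_lehmann(positions))
```

With these hypothetical values, the outlying candidate has little influence on any of the estimators, either because its weight is small or because it is discarded or clamped before averaging.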

Referring to steps S45 and S46, the present embodiment obtains additional information from the detector sum and/or difference waveforms to supplement the information used to reconstruct the alignment target asymmetry. In particular, there is disclosed the optional use of the (estimated) intensity |E_(L,ν)^(H)·E_(U,ν)| of the periodic components of various orders. This is not to be confused with the intensity as seen by either detector 430A, 430B. The following result from the above:

$\sqrt{u_{v,\cos}^{2} + u_{v,\sin}^{2}} \approx {4 \cdot \left| {{\underset{\_}{E}}_{B,v}^{H} \cdot \frac{1}{2} \cdot \begin{bmatrix}1 & 1 \\1 & 1\end{bmatrix} \cdot {\underset{\_}{E}}_{A,v}} \right|}$

allows the intensity of each periodic component as measured by the apparatus to be calculated, which can be compared with the modeled intensity to refine the model. The steps S45, S46 are optional and further discussion of their implementation is deferred until later in this description.

Calculation of Refined Asymmetry Measurement

FIG. 19 illustrates a simple target model for an alignment grating with periodicity in the X direction. In this model, a number M of distinct layers L(m) are defined, ranging from a “superstrate” SUP layer (Layer 1) which is effectively the free space above the target, to a substrate SUB (Layer M) which is, for example, in the bulk of substrate W, below the target. Parameter n_(m) denotes the complex index of refraction of the material in each individual layer, while h_(m) denotes the height of each individual layer. Parameter n_(g) denotes the complex index of refraction of the material forming the target grating. The geometry of a single grating line is represented by four vertices (x_(ν),z_(ν)), ν∈{1,2,3,4}. This geometry is understood to be repeated with spatial period P to the left and right of the diagram. It will be appreciated that a more complex grating profile will involve more vertices to represent it, ν∈{1, . . . , V}. Note that if there is partial overlap between an (input) layer and the grating, then additional layers can be used to discretize the grating (using a staircasing approach) per vertex-to-vertex z-interval, and one or two extra layer(s) will be used to model the non-overlapping part of the layer.

All of these parameters of the materials and geometry of the layers and the grating structure forming the alignment target in combination constitute the model that is the basis of calculating the measured alignment position, and one or more properties of the target structure. Some parameters of the model can be set to fixed values, while others are allowed to “float” for the purposes of reconstruction. Parameters can be derived from combinations of other parameters. Critical dimension, side wall angle and/or the like are all parameters that can be derived from the vertex positions (x_(ν),z_(ν)). A particular derived parameter is what is called asymmetry, which can be defined in a variety of ways, to suit the application. Whatever one or more parameters are used, the one or more floating (unknown) parameters can be summarized by a column vector denoted as p.

Referring now to step S47 of FIG. 11, the refinement of the asymmetry measurement in step S4 is described, using the position signals derived from the detectors 430A, 430B. As mentioned already, this technique is based on using the alignment position information as measured in different component signals, to estimate/reconstruct the alignment target asymmetry by refining the parameterized model. Making use of this additional information can be beneficial for two reasons, even if direct asymmetry measurements have already been made using a dedicated asymmetry sensor (arrangement 460). First, it can make use of extra, different information and reduce (unknown, floating) parameter correlations in the target model upon which the alignment position measurement is based. Secondly, making use of all the alignment position signals from detectors 430A, 430B increases the total number of photons used, and hence reduces the impact of the photon Poisson noise.

An asymmetry estimation/reconstruction problem can be expressed by defining a residual function of measurements made in step S3 by the asymmetry measuring arrangement 460. The asymmetry measuring arrangement 460 (referred to as the “asymmetry sensor” for short) can be of any type, for example of the type described in US 61/684,006 mentioned above, or of a type forming a pupil image for angle-resolved scatterometry. A detailed understanding of the asymmetry sensor is not necessary for an understanding of the present subject.

The residual function can be defined as follows:

${{\underset{\_}{R}\left( \underset{\_}{p} \right)} = \begin{bmatrix}{{\underset{\_}{\underset{\_}{C}}}_{{\underset{\_}{I}}_{D,{asymm},{meas}}}^{- \frac{1}{2}} \cdot \left( {{{\underset{\_}{I}}_{D,{asymm},{model}}\left( \underset{\_}{p} \right)} - {\underset{\_}{I}}_{D,{asymm},{meas}}} \right)} \\{{\underset{\_}{\underset{\_}{C}}}_{\Delta \; {\underset{\_}{x}}_{{align},{meas}}}^{- \frac{1}{2}} \cdot \left( {{\Delta \; {{\underset{\_}{x}}_{{align},{model}}\left( \underset{\_}{p} \right)}} - {\Delta \; {\underset{\_}{x}}_{{align},{meas}}}} \right)}\end{bmatrix}},$

where the column vector I_(D,asymm,meas) denotes all measured intensities from detectors in the asymmetry sensor, the column vector I_(D,asymm,model)(p) denotes all modeled intensities of the same detectors, the column vector p denotes the unknown (floating) parameters of the alignment target model, the (diagonal) matrix C_(I_(D,asymm,meas)) denotes the covariance matrix of all measured asymmetry sensor intensities, the column vector Δx_(align,meas) denotes all pairwise differences of measured alignment positions (from step S2), the column vector Δx_(align,model)(p) denotes all pairwise differences of modeled alignment positions, and the matrix C_(Δx_(align,meas)) denotes the covariance matrix of all pairwise differences of measured alignment positions. Note that the matrix C_(Δx_(align,meas)) is not necessarily a diagonal matrix. Note that the residual R(p) as defined above will itself have a covariance matrix which equals the identity matrix. In an embodiment not having asymmetry measuring arrangement 460, the residual function would contain only the second row, which involves the covariance matrix C_(Δx_(align,meas)), so that the function is based entirely on measurements of alignment position via detectors 430A, 430B. In the present embodiment, the alignment positions and modeled alignment positions that have been calculated from the waveforms for different color/polarization combinations and different orders are also used as the input to the step S4 for refining the asymmetry measurement. More generally speaking, information of the various orders from the multiple waveforms is used, whether it is already expressed in the form of the position measurements produced in step S2, or is in some other form. Also note that while, in this example, pairwise differences are calculated by simple subtraction of modeled positions, the invention is not limited to this specific interpretation of difference. Difference between positions (or position-related information) can be expressed in other forms, without departing from the scope of the invention. For example, differences between measurements can be expressed by ratios.

The column vector Δx_(align,meas) of pairwise differences of measured alignment positions is defined as

${\Delta {\underset{\_}{x}}_{{align},{meas}}} = \begin{bmatrix}{\Delta \; x_{{align},{v = 1},{meas}}} \\\vdots \\{\Delta \; x_{{align},{v = N},{meas}}}\end{bmatrix},$

in which the difference Δx_(align,ν,meas) of measured alignment positions between any two component signals is defined as

Δx_(align,ν,meas) = x_(align,ν,meas)(λ_(0,j),E_(S,j)) − x_(align,ν,meas)(λ_(0,m),E_(S,m)),

where λ_(0,j) denotes the illumination wavelength for measurement j, and

${\underset{\_}{E}}_{S,j} = \begin{bmatrix}E_{S,{\hat{\underset{\_}{x}}}_{P}} \\E_{S,{\hat{\underset{\_}{y}}}_{P}}\end{bmatrix}$

denotes the electric field at source level, for measurement j, in the x̂_(P) and ŷ_(P) polarization coordinate system. Note that one could also compute all pairwise differences of the alignment position between different diffraction orders ν_(j)≠ν_(m). This will increase the total number of differences, but the implementer should be aware that much of this information is correlated, and hence expanding the number of differences above a certain point may be of limited use.
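As a simple illustration of the pairwise differences defined above, the following Python sketch forms all differences between the positions measured for one diffraction order under different color/polarization combinations. The dictionary layout, the function name and the wavelength and position values are hypothetical and serve only to make the bookkeeping concrete.

```python
from itertools import combinations


def pairwise_differences(positions):
    """Return {(key_j, key_m): x_j - x_m} for every unordered pair of measurements.

    positions maps a (wavelength, polarization) key to the alignment position
    measured for a single diffraction order.
    """
    keys = sorted(positions)
    return {(j, m): positions[j] - positions[m] for j, m in combinations(keys, 2)}


if __name__ == "__main__":
    # Hypothetical positions for one order; keys are (wavelength in nm, polarization).
    measured = {(532, "x"): 1.0, (633, "x"): 1.5, (633, "y"): 0.5}
    for pair, delta in pairwise_differences(measured).items():
        print(pair, delta)
```

For n measurements this produces n·(n−1)/2 differences, of which only n−1 are linearly independent, which is one way to see why much of the information in the full set is correlated.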

In addition to the measured positions, one then takes account of predictions of measured positions, obtained from the model. The covariance matrix C_(Δx_(align,meas)) can be computed from C_(x_(align,meas)) as follows:

${{\underset{\_}{\underset{\_}{C}}}_{\Delta \; {\underset{\_}{x}}_{{align},{meas}}} = {\frac{{\partial\Delta}\; {\underset{\_}{x}}_{{align},{model}}}{\partial{\underset{\_}{x}}_{{align},{model}}} \cdot {\underset{\_}{\underset{\_}{C}}}_{{\underset{\_}{x}}_{{align},{meas}}} \cdot \left( \frac{{\partial\Delta}\; {\underset{\_}{x}}_{{align},{model}}}{\partial{\underset{\_}{x}}_{{align},{model}}} \right)^{T}}},$

where

$\frac{{\partial\Delta}\; {\underset{\_}{x}}_{{align},{model}}}{\partial{\underset{\_}{x}}_{{align},{model}}}$

denotes the Jacobian matrix of derivatives, with respect to the alignment positions, of the pairwise differences of modeled alignment positions. Typically this matrix will be a sparse matrix with only one 1 entry and one −1 entry per row. Hence C_(Δx_(align,meas)) is not necessarily a diagonal matrix, as some of the pairwise differences of modeled alignment positions can be correlated. Hence the implementer can trade off the use of all possible pairwise differences of modeled alignment positions against the difficulty of computing the Cholesky decomposition C_(Δx_(align,meas))⁻¹=(C_(Δx_(align,meas))^(−1/2))^(T)·C_(Δx_(align,meas))^(−1/2), which is needed in the proposed implementation of the asymmetry estimation/reconstruction, discussed below. One solution for this is to include all possible combinations of Δx_(align,ν,meas)=x_(align,ν,meas)(λ_(0,n),E_(S,n))−x_(align,ν,meas)(λ_(0,m),E_(S,m)), while reducing computational complexity by approximating the matrix C_(Δx_(align,meas)) by its diagonal only. Alternatively one could include only the uncorrelated differences Δx_(align,ν,meas).
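The covariance computation and the Cholesky-based weighting described above can be illustrated with the following sketch, which is written in plain Python for small matrices. It assumes a diagonal covariance of the positions (given as a list of variances), builds the sparse Jacobian with one 1 and one −1 per row, and applies the weighting C^(−1/2) by forward substitution with the lower Cholesky factor. The function names and numbers are hypothetical, and a numerical library would normally be used in practice.

```python
import math


def difference_jacobian(n_positions, pairs):
    """One row per difference x_j - x_m: +1 in column j and -1 in column m."""
    rows = []
    for j, m in pairs:
        row = [0.0] * n_positions
        row[j], row[m] = 1.0, -1.0
        rows.append(row)
    return rows


def difference_covariance(jacobian, position_variances):
    """C_dx = J . diag(position_variances) . J^T."""
    n_pos = len(position_variances)
    return [[sum(ra[k] * position_variances[k] * rb[k] for k in range(n_pos))
             for rb in jacobian] for ra in jacobian]


def cholesky_lower(matrix):
    """Lower-triangular L with L . L^T = matrix (matrix must be positive definite)."""
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            partial = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                pivot = matrix[i][i] - partial
                if pivot <= 0.0:
                    raise ValueError("matrix is not positive definite")
                lower[i][j] = math.sqrt(pivot)
            else:
                lower[i][j] = (matrix[i][j] - partial) / lower[j][j]
    return lower


def whiten(lower, residual):
    """Apply C^(-1/2) = L^(-1) to a residual by forward substitution."""
    out = []
    for i, r in enumerate(residual):
        out.append((r - sum(lower[i][k] * out[k] for k in range(i))) / lower[i][i])
    return out


if __name__ == "__main__":
    jac = difference_jacobian(3, [(0, 1), (0, 2)])       # two independent differences
    cov = difference_covariance(jac, [1.0, 4.0, 9.0])    # hypothetical position variances
    print(cov, whiten(cholesky_lower(cov), [1.0, 2.0]))
```

Note that if all three pairs (0,1), (0,2) and (1,2) were included in this small example, the resulting 3×3 covariance matrix would be rank deficient, since the third difference is a linear combination of the first two. This is one practical reason for either restricting the set of differences or approximating the matrix by its diagonal, as discussed above.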

To perform the step S47, the asymmetry estimation/reconstruction problem can be posed in the following terms:

$\quad\left\{ \begin{matrix}{{\underset{\_}{p}}_{{asymm},{estimated}} = {\arg \mspace{11mu} \min {{\underset{\_}{R}\left( \underset{\_}{p} \right)}}_{2}^{2}}} \\{{\underset{\_}{R}\left( \underset{\_}{p} \right)} = {\begin{bmatrix}{{\underset{\_}{\underset{\_}{C}}}_{{\underset{\_}{I}}_{D,\; {asymm},{meas}}}^{- \frac{1}{2}} \cdot \left( {{{\underset{\_}{I}}_{D,\; {asymm},{model}}\left( \underset{\_}{p} \right)} - {\underset{\_}{I}}_{D,\; {asymm},{meas}}} \right)} \\{{\underset{\_}{\underset{\_}{C}}}_{\Delta \; {\underset{\_}{x}}_{{align},{meas}}}^{- \frac{1}{2}} \cdot \left( {{\Delta \; {{\underset{\_}{x}}_{{align},{model}}\left( \underset{\_}{p} \right)}} - {\Delta \; {\underset{\_}{x}}_{{align},{meas}}}} \right)}\end{bmatrix}.}}\end{matrix} \right.$

In other words, the task is to use the calculated covariance matrices as weighting matrices, to minimize the residual function R(p), and hence to obtain the result p_(asymm,estimated), which is a best estimate of the set of parameters of the target model (model of the periodic structure forming the alignment mark 202, etc.). When the model is defined to include one or more asymmetry-related parameters, the vector p_(asymm,estimated) includes the estimate of asymmetry. This non-linear minimization problem can be solved efficiently, using algorithms known to those skilled in the art, for example Newton minimization approaches. The resulting set of parameters p includes the refined asymmetry measurement as one of the parameters, in whatever form of expression is desired. Any other parameters that are unknown can also be measured by allowing them to float in the model while the minimization of the residual is performed. For example, the target tilts ρ_(ŷ_(P)) and ρ_(x̂_(P)) are parameters that can be allowed to float in the model, and hence be measured in this way.
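The weighted minimization can be illustrated by the following deliberately simplified sketch. It uses a Gauss-Newton iteration (one Newton-type approach among several that could be chosen), a single floating parameter, a diagonal covariance so that the weighting reduces to division by standard deviations, and a finite-difference derivative. The toy model, which maps the parameter to position differences through a sine function, is purely hypothetical and does not represent the actual target model.

```python
import math


def gauss_newton_1d(model, measured, sigmas, p0, n_iter=20, step=1e-6):
    """Minimize sum(((model(p)_i - measured_i) / sigma_i)^2) over a scalar p."""
    p = p0
    for _ in range(n_iter):
        # Weighted residual and its forward-difference derivative with respect to p.
        res = [(m - y) / s for m, y, s in zip(model(p), measured, sigmas)]
        res_step = [(m - y) / s for m, y, s in zip(model(p + step), measured, sigmas)]
        jac = [(a - b) / step for a, b in zip(res_step, res)]
        denom = sum(j * j for j in jac)
        if denom == 0.0:
            break
        p -= sum(j * r for j, r in zip(jac, res)) / denom
    return p


# Hypothetical sensitivities of three position differences to the parameter.
TOY_COEFFS = [2.0, -1.0, 0.5]


def toy_model(p):
    """Hypothetical modeled position differences as a function of one parameter."""
    return [c * math.sin(p) for c in TOY_COEFFS]


if __name__ == "__main__":
    measured = toy_model(0.3)  # noise-free synthetic "measurement"
    print(gauss_newton_1d(toy_model, measured, [0.1, 0.2, 0.1], p0=0.0))
```

With several floating parameters, the scalar update would be replaced by the solution of a small linear system in each iteration, and a non-diagonal covariance would be handled by applying its Cholesky factor to the residual before the update is computed.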

Referring again to steps S45 and S46, if it is desired to make use of the intensity

$4 \cdot \left| {{\underset{\_}{E}}_{B,v}^{H} \cdot \frac{1}{2} \cdot \begin{bmatrix}1 & 1 \\1 & 1\end{bmatrix} \cdot {\underset{\_}{E}}_{A,v}} \right|$

to provide additional information for the alignment target asymmetry estimation, the residual function used in S47 can be modified as follows:

${\underset{\_}{R}\left( \underset{\_}{p} \right)} = \begin{bmatrix}{{\underset{\_}{\underset{\_}{C}}}_{{\underset{\_}{I}}_{D,{asymm},{meas}}}^{- \frac{1}{2}} \cdot \left( {{{\underset{\_}{I}}_{D,{asymm},{model}}\left( \underset{\_}{p} \right)} - {\underset{\_}{I}}_{D,{asymm},{meas}}} \right)} \\{{\underset{\_}{\underset{\_}{C}}}_{\begin{matrix}{\Delta \; {\underset{\_}{x}}_{{align},{meas}}} \\{4 \cdot {{{\underset{\_}{E}}_{B,v}^{H} \cdot \frac{1}{2} \cdot {\lbrack\begin{matrix}1 & 1 \\1 & 1\end{matrix}\rbrack} \cdot {\underset{\_}{E}}_{A,v}}}_{meas}}\end{matrix}}^{- \frac{1}{2}} \cdot \begin{bmatrix}{{\Delta \; {{\underset{\_}{x}}_{{align},{model}}\left( \underset{\_}{p} \right)}} - {\Delta \; {\underset{\_}{x}}_{{align},{meas}}}} \\{{4 \cdot {{{\underset{\_}{E}}_{B,v}^{H} \cdot \frac{1}{2} \cdot \begin{bmatrix}1 & 1 \\1 & 1\end{bmatrix} \cdot {\underset{\_}{E}}_{A,v}}}_{model}} - {4 \cdot {{{\underset{\_}{E}}_{B,v}^{H} \cdot \frac{1}{2} \cdot \begin{bmatrix}1 & 1 \\1 & 1\end{bmatrix} \cdot {\underset{\_}{E}}_{A,v}}}_{meas}}}\end{bmatrix}}\end{bmatrix}$

where

${\underset{\_}{\underset{\_}{C}}}_{\begin{matrix}{\Delta \; {\underset{\_}{x}}_{{align},{meas}}} \\{4 \cdot {\left| {{\underset{\_}{E}}_{B,v}^{H} \cdot \frac{1}{2} \cdot {\lbrack\begin{matrix}1 & 1 \\1 & 1\end{matrix}\rbrack} \cdot {\underset{\_}{E}}_{A,v}} \right|}_{meas}}\end{matrix}}$

denotes the covariance matrix of the column vector

$\begin{bmatrix}{\Delta \; {\underset{\_}{x}}_{{align},{meas}}} \\{4 \cdot {\left| {{\underset{\_}{E}}_{B,v}^{H} \cdot \frac{1}{2} \cdot \begin{bmatrix}1 & 1 \\1 & 1\end{bmatrix} \cdot {\underset{\_}{E}}_{A,v}} \right|}_{meas}}\end{bmatrix}.$

Note that this matrix is not a diagonal matrix, as Δx_(align,meas) and

$4 \cdot {\left| {{\underset{\_}{E}}_{B,v}^{H} \cdot \frac{1}{2} \cdot \begin{bmatrix}1 & 1 \\1 & 1\end{bmatrix} \cdot {\underset{\_}{E}}_{A,v}} \right|}_{meas}$

are mutually correlated. To avoid costly and complex computations to compute the covariance matrix, one can approximate it by its diagonal. If so, only σ_(Δx_(align,meas))² and

$\sigma_{4 \cdot \left| {{\underset{\_}{E}}_{B,v}^{H} \cdot \frac{1}{2} \cdot {\lbrack\begin{matrix}1 & 1 \\1 & 1\end{matrix}\rbrack} \cdot {\underset{\_}{E}}_{A,v}} \right|}^{2}$

are computed. The computation of σ_(Δx_(align,meas))² was discussed already above in the context of calculating the position variances (step S44).

In order to compute

$\sigma_{4 \cdot \left| {{\underset{\_}{E}}_{B,v}^{H} \cdot \frac{1}{2} \cdot {\lbrack\begin{matrix}1 & 1 \\1 & 1\end{matrix}\rbrack} \cdot {\underset{\_}{E}}_{A,v}} \right|}^{2}$

in step S46, the following derivatives are computed:

$\left\{ \begin{matrix}{\frac{\partial \left( {2 \cdot \left| {{\underset{\_}{E}}_{\alpha,{- v}}^{H} \cdot {\underset{\_}{E}}_{\alpha,v}} \right|} \right)}{\partial u_{v,\cos}} = {\frac{\partial}{\partial u_{v,\cos}}\sqrt{u_{v,\cos}^{2} + u_{v,\sin}^{2}}}} \\{\frac{\partial \left( {2 \cdot \left| {{\underset{\_}{E}}_{\alpha,{- v}}^{H} \cdot {\underset{\_}{E}}_{\alpha,v}} \right|} \right)}{\partial u_{v,\sin}} = {\frac{\partial}{\partial u_{v,\sin}}\sqrt{u_{v,\cos}^{2} + u_{v,\sin}^{2}}}}\end{matrix} \right.$

$\left\{ \begin{matrix}{\frac{\partial \left( {2 \cdot \left| {{\underset{\_}{E}}_{\alpha,{- v}}^{H} \cdot {\underset{\_}{E}}_{\alpha,v}} \right|} \right)}{\partial u_{v,\cos}} = \frac{u_{v,\cos}}{\sqrt{u_{v,\cos}^{2} + u_{v,\sin}^{2}}}} \\{\frac{\partial \left( {2 \cdot \left| {{\underset{\_}{E}}_{\alpha,{- v}}^{H} \cdot {\underset{\_}{E}}_{\alpha,v}} \right|} \right)}{\partial u_{v,\sin}} = \frac{u_{v,\sin}}{\sqrt{u_{v,\cos}^{2} + u_{v,\sin}^{2}}}.}\end{matrix} \right.$

Following the same reasoning as in step S44, the following result for the variance of the estimated intensity

$4 \cdot {{{\underset{\_}{E}}_{B,v}^{H} \cdot \frac{1}{2} \cdot \begin{bmatrix}1 & 1 \\1 & 1\end{bmatrix} \cdot {\underset{\_}{E}}_{A,v}}}$

can be derived:

$\sigma_{4 \cdot {{{\underset{\_}{E}}_{B,v}^{H} \cdot \frac{1}{2} \cdot {\lbrack\begin{matrix}1 & 1 \\1 & 1\end{matrix}\rbrack} \cdot {\underset{\_}{E}}_{A,v}}}}^{2} = {{\frac{{\partial 4} \cdot {{{\underset{\_}{E}}_{B,v}^{H} \cdot \frac{1}{2} \cdot \begin{bmatrix}1 & 1 \\1 & 1\end{bmatrix} \cdot {\underset{\_}{E}}_{A,v}}}}{\partial u_{v,\; \cos}} \cdot \sigma_{u_{v,\; \cos}}^{2} \cdot \frac{{\partial 4} \cdot {{{\underset{\_}{E}}_{B,v}^{H} \cdot \frac{1}{2} \cdot \begin{bmatrix}1 & 1 \\1 & 1\end{bmatrix} \cdot {\underset{\_}{E}}_{A,v}}}}{\partial u_{v,\; \cos}}} + {\frac{{\partial 4} \cdot {{{\underset{\_}{E}}_{B,v}^{H} \cdot \frac{1}{2} \cdot \begin{bmatrix}1 & 1 \\1 & 1\end{bmatrix} \cdot {\underset{\_}{E}}_{A,v}}}}{\partial u_{v,\; \sin}} \cdot \sigma_{u_{v,\; \sin}}^{2} \cdot \frac{{\partial 4} \cdot {{{\underset{\_}{E}}_{B,v}^{H} \cdot \frac{1}{2} \cdot \begin{bmatrix}1 & 1 \\1 & 1\end{bmatrix} \cdot {\underset{\_}{E}}_{A,v}}}}{\partial u_{v,\; \sin}}}}$$\sigma_{4 \cdot {{{\underset{\_}{E}}_{B,v}^{H} \cdot \frac{1}{2} \cdot {\lbrack\begin{matrix}1 & 1 \\1 & 1\end{matrix}\rbrack} \cdot {\underset{\_}{E}}_{A,v}}}}^{2} = {{\left( \frac{u_{v,\; \cos}}{\sqrt{u_{v,\; \cos}^{2} + u_{v,\; \sin}^{2}}} \right)^{2} \cdot \sigma_{u_{v,\; \cos}}^{2}} + {\left( \frac{u_{v,\; \sin}}{\sqrt{u_{v,\; \cos}^{2} + u_{v,\; \sin}^{2}}} \right)^{2} \cdot \sigma_{u_{v,\; \sin}}^{2}}}$$\sigma_{4 \cdot {{{\underset{\_}{E}}_{B,v}^{H} \cdot \frac{1}{2} \cdot {\lbrack\begin{matrix}1 & 1 \\1 & 1\end{matrix}\rbrack} \cdot {\underset{\_}{E}}_{A,v}}}}^{2} = {\frac{1}{u_{v,\; \cos}^{2} + u_{v,\; \sin}^{2}} \cdot {\left( {{u_{v,\; \cos}^{2} \cdot \sigma_{u_{v,\; \cos}}^{2}} + {u_{v,\; \sin}^{2} \cdot \sigma_{u_{v,\; \sin}^{2}}}} \right).}}$

where the variances σ_(u) _(ν,cos) ² and σ_(u) _(ν,sin) ² are computed as before. For the case in which the alignment mark is symmetrical about a zero position and the alignment signal consists only of the first diffraction order, the same lines of reasoning as discussed above can be used to simplify the expression for the variance of each (estimated) intensity

$4 \cdot {{{\underset{\_}{E}}_{B,v}^{H} \cdot \frac{1}{2} \cdot \begin{bmatrix}1 & 1 \\1 & 1\end{bmatrix} \cdot {\underset{\_}{E}}_{A,v}}}$

into:

$\left\{ \begin{matrix}{\sigma_{4 \cdot {\underset{\_}{E}}_{B,v}^{H} \cdot \frac{1}{2} \cdot \begin{bmatrix}1 & 1 \\1 & 1\end{bmatrix} \cdot {\underset{\_}{E}}_{A,v}}^{2} = \sigma_{u_{{v = 1},\cos,\sin}}^{2}} \\{\sigma_{u_{{v = 1},\cos,\sin}}^{2} \approx G^{2} \cdot \frac{u_{0}}{2}.}\end{matrix} \right.$
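The first-order variance propagation derived above can be illustrated with a minimal Python sketch. The function name `intensity_variance` and its argument names are hypothetical and do not appear in the disclosure; the sketch only evaluates the final expression for the variance of the amplitude sqrt(u_cos² + u_sin²), assuming the two Fourier coefficients are uncorrelated, as the diagonal form of the derivation implies.

```python
def intensity_variance(u_cos, u_sin, var_u_cos, var_u_sin):
    """First-order variance of sqrt(u_cos**2 + u_sin**2).

    Hypothetical helper: evaluates
        (u_cos^2 * var_u_cos + u_sin^2 * var_u_sin) / (u_cos^2 + u_sin^2),
    assuming the cosine and sine coefficients are uncorrelated.
    """
    amp_sq = u_cos ** 2 + u_sin ** 2
    if amp_sq == 0.0:
        # The partial derivatives are undefined at zero amplitude.
        raise ValueError("amplitude is zero; derivatives are undefined")
    return (u_cos ** 2 * var_u_cos + u_sin ** 2 * var_u_sin) / amp_sq


# With equal variances on both coefficients, the result reduces to that
# common variance, which mirrors the simplified expression above.
print(intensity_variance(3.0, 4.0, 0.5, 0.5))  # 0.5
```

When the two coefficient variances are equal, the weighting by u_cos² and u_sin² cancels against the denominator, which is why the simplified case no longer depends on the signal phase.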

Referring then to step S48 in FIG. 11, the minimization process results in a model with parameters p which include an improved measurement of asymmetry of the target grating.

Calculation of Corrected Position Measurements

Returning to FIG. 10, the refined asymmetry measurement is then applied in step S5 to the correction of the numerous position measurements obtained in step S2/S43. Then follows in step S6 the selection or combination of these to obtain a single “best” position measurement. These steps will now be described.

Given all measured alignment positions x _(align,meas) and all modeled alignment positions x _(align,model) (p), asymmetry-corrected alignment positions x _(align,corrected) can be computed as follows:

$\begin{bmatrix}x_{1} \\x_{2} \\ \vdots \\x_{Q}\end{bmatrix}_{{align},{corrected}} = \begin{bmatrix}x_{1} \\x_{2} \\ \vdots \\x_{Q}\end{bmatrix}_{{align},{meas}} - \left( \begin{bmatrix}{x_{1}\left( \underset{\_}{p} \right)} \\{x_{2}\left( \underset{\_}{p} \right)} \\ \vdots \\{x_{Q}\left( \underset{\_}{p} \right)}\end{bmatrix}_{{align},{model}} - x_{{align},{model},{reference}}\left( x_{v},z_{v} \right) \right)$

${\underset{\_}{x}}_{{align},{corrected}} = {\underset{\_}{x}}_{{align},{meas}} - \left( {\underset{\_}{x}}_{{align},{model}}\left( \underset{\_}{p} \right) - x_{{align},{model},{reference}}\left( x_{v},z_{v} \right) \right).$

In this equation, Q ∈ {1, 2, 3, 4, . . . } denotes the total number of alignment position measurements (i.e. for all combinations of illumination color and polarization and all Fourier components) and x_(align,model,reference) (x_(ν), z_(ν)) denotes an alignment reference point x-position. This alignment reference point (x_(align,model,reference), z_(align,model,reference)) can be defined as a function of the grating vertices (x_(ν), z_(ν)) in the model (FIG. 19). One can therefore select the alignment reference point (x_(align,model,reference), z_(align,model,reference)) such that its position is most relevant for the actual device patterns that are to be aligned in the lithographic process. This is a particularly interesting facility when one appreciates that the grating is only a distorted version of some ideal profile, and therefore any reference position that one might choose (e.g. the center of the mark) is never precisely defined. Note that the term (x_(align,model)(p) − x_(align,model,reference) (x_(ν), z_(ν))) denotes the modeled (i.e. taking the alignment grating asymmetry into account) alignment position shift between the modeled alignment positions and the alignment reference point.
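The correction step can be sketched in a few lines of Python. The function name `correct_positions`, the list representation of the Q measurements and the numeric values are all hypothetical, chosen only to illustrate the equation above; the reference position is treated as a single scalar, as in the text.

```python
def correct_positions(x_meas, x_model, x_reference):
    """Apply x_corrected = x_meas - (x_model(p) - x_reference) element-wise.

    x_meas:      Q measured alignment positions (one per color,
                 polarization and Fourier component)
    x_model:     Q modeled alignment positions for the fitted parameters p
    x_reference: scalar alignment reference x-position from the model
    All names and the list layout are illustrative assumptions.
    """
    if len(x_meas) != len(x_model):
        raise ValueError("x_meas and x_model must have the same length Q")
    return [m - (mod - x_reference) for m, mod in zip(x_meas, x_model)]


# Hypothetical values (arbitrary units): three measurements that disagree
# because each is shifted differently by the grating asymmetry.
x_meas = [10.0, 12.0, 9.0]
x_model = [1.0, 3.0, 0.0]
print(correct_positions(x_meas, x_model, 0.5))  # [9.5, 9.5, 9.5]
```

In this idealized example the model explains the spread between the measurements completely, so the corrected positions coincide; with real data a residual spread would remain and is handled in step S6.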

Now in step S6, given the set of corrected alignment positions x _(align,corrected), one can compute one single, efficient and robust alignment position estimate, using an appropriate statistical technique for selection or averaging of the candidate measurements. For this, the variances calculated in steps S44 and S46 can be used to assign a higher weighting or rank to the measurements with the highest reliability. Various different averages can be used, which can also be referred to as “location estimators”. These include means, medians, weighted means, or weighted medians. Outliers can also be discarded. Rank-based estimators such as Hodges-Lehmann estimators may be used. Note that this functionality is comparable to the “Color Dynamics” functionality. As a further refinement, a weighted Hodges-Lehmann location estimator yields an alignment position estimate in which all information is used (i.e. all photons are used), but which is robust against outliers.
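As an illustration of such a location estimator, the following Python sketch computes one possible weighted Hodges-Lehmann estimate. The disclosure does not specify a weighting scheme; here it is assumed that each measurement is weighted by the inverse of its variance and that each pairwise (Walsh) average is weighted by the product of the two weights. The function names are hypothetical.

```python
from itertools import combinations_with_replacement


def weighted_median(values, weights):
    """Smallest value at which the cumulative weight reaches half the total."""
    pairs = sorted(zip(values, weights))
    total = sum(w for _, w in pairs)
    running = 0.0
    for value, weight in pairs:
        running += weight
        if running >= 0.5 * total:
            return value
    raise ValueError("empty input")


def weighted_hodges_lehmann(positions, variances):
    """Weighted median of all pairwise averages (i <= j) of the positions.

    Assumption (not from the source): weight_i = 1 / variance_i, and the
    pair (i, j) carries weight_i * weight_j. Variances must be positive.
    """
    weights = [1.0 / v for v in variances]
    walsh, walsh_weights = [], []
    for i, j in combinations_with_replacement(range(len(positions)), 2):
        walsh.append(0.5 * (positions[i] + positions[j]))
        walsh_weights.append(weights[i] * weights[j])
    return weighted_median(walsh, walsh_weights)


# Hypothetical corrected positions with one gross outlier (50.0).
positions = [9.5, 9.6, 9.4, 50.0]
variances = [1.0, 1.0, 1.0, 1.0]
estimate = weighted_hodges_lehmann(positions, variances)
print(estimate)  # stays near 9.5, whereas the plain mean is about 19.6
```

Every measurement contributes to the pairwise averages, so no data are thrown away, yet a single outlier cannot drag the estimate far because the median of the averages is taken rather than their mean.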

To end this description, computation of the alignment position measurement variances σ_(align,corrected) ² as a measure of the improved quality of the position measurements obtained by the method herein is discussed. Start by recalling the following equation from above:

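${\underset{\_}{x}}_{{align},{corrected}} = {\underset{\_}{x}}_{{align},{meas}} - \left( {\underset{\_}{x}}_{{align},{model}}\left( \underset{\_}{p} \right) - x_{{align},{model},{reference}}\left( x_{v},z_{v} \right) \right)$

${\underset{\_}{x}}_{{align},{corrected}} = {\underset{\_}{x}}_{{align},{meas}} - \Delta\,{\underset{\_}{x}}_{{align},{correction}},$

where use of the following shorthand notation has been made:

$\Delta\,{\underset{\_}{x}}_{{align},{correction}} = {\underset{\_}{x}}_{{align},{model}}\left( \underset{\_}{p} \right) - x_{{align},{model},{reference}}\left( x_{v},z_{v} \right).$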

Now recall from the discussion of step S44 above that all alignment position measurements x _(align,meas) are mutually uncorrelated (at least for the photon Poisson noise component of the uncertainty). It is also assumed that x _(align,meas) and (x _(align,model) (p) − x_(align,model,reference) (x_(ν),z_(ν))) are not correlated. This is a reasonable assumption in the above embodiment where the asymmetry measurement comes predominantly from the asymmetry sensor (arrangement 460) and therefore makes use of different photons than the position measurement. Hence the variance of x _(align,corrected) can be computed using σ _(align,corrected) ² = σ _(align,meas) ² + σ _(align,correction) ², where σ _(align,correction) ² denotes the variance of Δx _(align,correction) = x _(align,model)(p) − x_(align,model,reference)(x_(ν),z_(ν)). In other words, the variance of the measured position after correction is greater than before correction. At first sight, it would appear that consequently the corrected measurement is inferior to the uncorrected one. However, it should be remembered that the variance relates only to the reproducibility of the measurement, and the greater aim is to eliminate or at least reduce systematic errors in the position measurements, caused by absent or inaccurate knowledge of asymmetry in the target grating. Therefore, provided that the additional variance is smaller than the systematic gain in accuracy, an overall benefit is achieved.

In order to quantify the additional variance, some calculations and simulations are made for different stacks. These indicate that, for the color(s) that already have the best reproducibility (i.e. lowest standard deviation σ _(x) _(align,meas) ), the additional deviation σ_(Δx) _(align,correction) is in the ratio

$\frac{\sigma_{\Delta \; {\underset{\_}{x}}_{{align},{correction}}}}{\sigma_{{\underset{\_}{x}}_{{align},{meas}}}} \approx {\frac{1}{3}.}$

Therefore the reduction in reproducibility of the final measurement is only modest, and this disadvantage can easily be outweighed by the reduction in systematic error. Note that it is assumed that the optics transmission of the asymmetry branch and the optics transmission of the alignment branch are equal. Note that it is also assumed here that 16 wavelengths are used to estimate the target asymmetry, while only one wavelength (i.e. the one with the best signal quality for the particular target) is used to estimate the alignment target position.
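To make "only modest" concrete, the following short Python sketch combines the two uncorrelated contributions in quadrature, using the ratio of about 1/3 quoted above. The function name `corrected_sigma` is hypothetical, and the calculation relies only on the variance sum stated earlier in this description.

```python
import math


def corrected_sigma(sigma_meas, ratio=1.0 / 3.0):
    """Standard deviation after correction for uncorrelated contributions.

    sigma_corrected^2 = sigma_meas^2 + sigma_correction^2, with
    sigma_correction = ratio * sigma_meas (ratio of about 1/3 per the text).
    """
    sigma_correction = ratio * sigma_meas
    return math.sqrt(sigma_meas ** 2 + sigma_correction ** 2)


# Relative to a measurement standard deviation of 1 (arbitrary units),
# the corrected value is sqrt(1 + 1/9) = sqrt(10)/3, about 1.054.
print(round(corrected_sigma(1.0), 4))  # 1.0541
```

Under these assumptions the standard deviation grows by roughly 5%, which is the quantity to be weighed against the reduction in systematic error.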

Note that the systematic gain in accuracy and the additional deviation can be calculated and compared, before deciding to use the corrected measurement. In other words, in circumstances where the additional variance is larger than the systematic gain in accuracy, the correction can be discarded. The decision to discard the correction is something that can be determined either beforehand (i.e. when defining the recipe for particular targets), or in real time (i.e. in response to data observed while measuring).
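One way such a comparison might be made is sketched below in Python. The disclosure does not prescribe a specific criterion; as an assumption for illustration, the sketch compares the mean squared error (systematic error squared plus variance) with and without the correction. The function name and all arguments are hypothetical.

```python
def use_correction(systematic_before, systematic_after, var_meas, var_correction):
    """Return True when the corrected measurement has the lower expected error.

    Assumed criterion (not from the source): mean squared error, i.e.
    systematic error squared plus variance. The corrected measurement
    carries the additional variance var_correction.
    """
    mse_uncorrected = systematic_before ** 2 + var_meas
    mse_corrected = systematic_after ** 2 + var_meas + var_correction
    return mse_corrected < mse_uncorrected


# Large asymmetry-induced error: the correction is worthwhile.
print(use_correction(2.0, 0.5, 1.0, 1.0 / 9.0))    # True
# Negligible systematic error: the added variance dominates, so discard.
print(use_correction(0.1, 0.05, 1.0, 1.0 / 9.0))   # False
```

The same test could be evaluated once when a recipe is defined, or repeatedly during measurement if estimates of the systematic error are available in real time.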

CONCLUSION

The above disclosure described how measurements of a property such as asymmetry can be derived by comparing a number of different results that all are derivable from position dependent signals existing in the alignment sensor. Some of these signals are results related to the position of the mark, and may for example be position measurements produced using different colors, polarizations and/or different spatial frequency components of position-dependent optical signals detected in the alignment sensor. Other results can be considered, for example the intensity values of the signals related to position, to obtain further information on the structure property. The information from these results may be combined with other measurements of the property, for example made by a separate measuring branch operating with the same illumination arrangement as the alignment sensor.

It should be understood that the processing unit PU which controls the alignment sensor, processes signals detected by it, and calculates from these signals position measurements suitable for use in controlling the lithographic patterning process, will typically involve a computer assembly of some kind, which will not be described in detail. The computer assembly may be a dedicated computer external to the apparatus, it may be a processing unit or units dedicated to the alignment sensor and/or it may be a central control unit LACU controlling the lithographic apparatus as a whole. The computer assembly may be arranged for loading a computer program product comprising computer executable code. This may enable the computer assembly, when the computer program product is downloaded, to control aforementioned uses of a lithographic apparatus with the alignment sensor AS.

Although specific reference may be made in this text to the use of lithographic apparatus in the manufacture of ICs, it should be understood that the lithographic apparatus described herein may have other applications, such as the manufacture of integrated optical systems, guidance and detection patterns for magnetic domain memories, flat-panel displays, liquid-crystal displays (LCDs), thin-film magnetic heads, etc. The skilled artisan will appreciate that, in the context of such alternative applications, any use of the terms “wafer” or “die” herein may be considered as synonymous with the more general terms “substrate” or “target portion”, respectively. The substrate referred to herein may be processed, before or after exposure, in for example a track (a tool that typically applies a layer of resist to a substrate and develops the exposed resist), a metrology tool and/or an inspection tool. Where applicable, the disclosure herein may be applied to such and other substrate processing tools. Further, the substrate may be processed more than once, for example in order to create a multi-layer IC, so that the term substrate used herein may also refer to a substrate that already contains multiple processed layers.

Although specific reference may have been made above to the use of embodiments of the invention in the context of optical lithography, it will be appreciated that the invention may be used in other applications, for example imprint lithography, and where the context allows, is not limited to optical lithography. In imprint lithography a topography in a patterning device defines the pattern created on a substrate. The topography of the patterning device may be pressed into a layer of resist supplied to the substrate whereupon the resist is cured by applying electromagnetic radiation, heat, pressure or a combination thereof. The patterning device is moved out of the resist leaving a pattern in it after the resist is cured.

The terms “radiation” and “beam” used herein encompass all types of electromagnetic radiation, including ultraviolet (UV) radiation (e.g. having a wavelength of or about 365, 355, 248, 193, 157 or 126 nm) and extreme ultra-violet (EUV) radiation (e.g. having a wavelength in the range of 5-20 nm), as well as particle beams, such as ion beams or electron beams.

The term “lens”, where the context allows, may refer to any one or combination of various types of optical components, including refractive, reflective, magnetic, electromagnetic and electrostatic optical components.

While specific embodiments of the invention have been described above, it will be appreciated that the invention may be practiced otherwise than as described. For example, the invention may take the form of a computer program containing one or more sequences of machine-readable instructions describing a method as disclosed above, or a data storage medium (e.g. semiconductor memory, magnetic or optical disk) having such a computer program stored therein.

The descriptions above are intended to be illustrative, not limiting. Thus, it will be apparent to one skilled in the art that modifications may be made to the invention as described without departing from the scope of the claims set out below.

1. A method of measuring a property of a structure, the method comprising: illuminating the structure with radiation and detecting radiation diffracted by the structure using a detector; processing signals representing the diffracted radiation to obtain a plurality of results related to a position of the structure, each result having the same form but being influenced in a different way by a variation in the property; and calculating a measurement of the property of the structure that is at least partially based on a difference observed among the plurality of results.
2. The method of claim 1, wherein the plurality of results includes results based on illumination and detection of radiation at different wavelengths.
3. The method of claim 1, wherein the plurality of results includes results based on illumination and detection of radiation at different polarizations.
4. The method of claim 1, wherein the plurality of results includes results based on different spatial frequencies within a position-dependent signal received by the detector.
5. The method of claim 4, wherein the structure has a form that is substantially periodic in one or more directions, and the different spatial frequencies correspond to different orders of diffraction by the periodic structure.
6. The method of claim 1, wherein the calculating the measurement of the property uses the difference in combination with another result obtained using radiation diffracted by the structure, but not related to the position of the structure.
7. The method of claim 6, wherein the other result is obtained using another detector processing a different portion of the radiation diffracted by the structure at the same time as the detecting the radiation diffracted by the structure using the detector.
8. The method of claim 6, wherein the other result includes a result obtained from the same signals as the results related to the position of the structure.
9. The method of claim 1, wherein the property is an asymmetry related parameter of the structure.
10. A method of measuring the position of a periodic structure, the method comprising measuring a property of the structure using the method of any of claim 1 and further comprising: calculating a measurement of the position of the structure using one or more of the plurality of results corrected in accordance with the measurement of the property.
11. The method of claim 10, wherein calculating the measurement of the position comprises applying corrections to two or more of the plurality of the results using the measurement of the property, followed by calculating the position measurement using one or more of the corrected results.
12. The method of claim 10, wherein calculating the measurement of the position comprises calculating a quality measure for each of the plurality of results and using the quality measures to determine to what degree each result contributes to the position measurement.
13. A method of manufacturing devices wherein a device pattern is applied to a substrate using a lithographic process, the method including positioning the applied pattern by reference to measured position of a periodic structure formed on the substrate, the measured position obtained by the method of claim 10.
14. A lithographic apparatus comprising: a patterning subsystem configured to transfer a pattern to a substrate; a measuring subsystem configured to measure a position of the substrate in relation to the patterning subsystem, wherein the patterning subsystem is arranged to use the position measured by the measuring subsystem to apply the pattern at a desired position on the substrate and wherein the measuring subsystem is configured to measure the position of the substrate using a periodic structure on the substrate and measure the position of the periodic structure by: illuminating the periodic structure with radiation and detecting radiation diffracted by the periodic structure using a detector; processing signals representing the diffracted radiation to obtain a plurality of results related to a position of the periodic structure, each result having the same form but being influenced in a different way by a variation in the property; calculating a measurement of the property of the periodic structure that is at least partially based on a difference observed among the plurality of results; and calculating a measurement of the position of the periodic structure using one or more of the plurality of results corrected in accordance with the measurement of the property.
15. An apparatus to measure a position of a structure, the apparatus comprising: a detecting arrangement configured to detect radiation diffracted by the structure using a detector; a processing arrangement configured to process signals representing the diffracted radiation to obtain a plurality of results related to a position of the structure, each result having the same form but being influenced in a different way by variation in a property of the structure; a calculating arrangement configured to calculate a position of the structure using one or more of the results obtained by the processing arrangement, wherein the calculating arrangement is configured to include a correction in the calculated position in accordance with a measurement of the property of the structure, and wherein the calculating arrangement is configured to calculate the measurement of the property of the structure at least partially on the basis of a difference observed among the plurality of results.
16. The apparatus of claim 15, further comprising an illuminating arrangement arranged to illuminate the structure with radiation of a plurality of wavelengths, and wherein the detecting arrangement is configured to detect separately the radiation of the plurality of wavelengths and wherein the plurality of results obtained by the processing arrangement include a plurality of results obtained using radiation of different wavelengths.
17. The apparatus of claim 15, wherein the plurality of results obtained by the processing arrangement include a plurality of results corresponding to different diffraction orders in the diffracted radiation.
18. The apparatus of claim 17, arranged to scan the structure with the radiation and wherein the detecting arrangement includes an interferometer configured to generate a position dependent signal that varies as the structure is scanned with the radiation, and wherein the plurality of results corresponding to different diffraction orders are obtained by extracting different spatial frequency components from the position dependent signal.